A highly controversial open-source project, the "WeChat clone" tool WeClone, has gone viral!

Explore the mystery of digital avatars and experience the novel feeling of dialogue with memories.
Core content:
1. Introduction to the WeClone project and how it clones digital avatars from WeChat chat records
2. Analysis of core features, including LLM fine-tuning, voice cloning, and multi-platform deployment
3. Outlook on application scenarios, from personal assistants to content creation to the possibility of digital immortality
Is there a chat window in your WeChat that hasn't received a new message in a long time, yet you still open it in the middle of the night and scroll through it over and over?
If you could use those chat records to clone the other person's "digital avatar", preserving their tone, style, and unique catchphrases, and even have it send you voice messages, would you do it?
Recently, a new open-source project WeClone was launched on GitHub, which makes it possible for the person in your memory to live forever in the digital world.
WeClone fine-tunes the Large Language Model (LLM) through personal WeChat chat records to create a personalized digital avatar.
It provides an end-to-end solution from text generation to voice cloning and from training to deployment, so the digital avatar not only writes like the person but also sounds like them.
Besides preserving someone in your memory, you can also create a digital avatar of yourself. Have you ever wondered what it would be like to chat with yourself?
Digital human technology is indeed highly playable. As soon as the project launched, it attracted a lot of attention from netizens both in China and abroad, and many let their imaginations run wild.
Project address: https://github.com/xming521/WeClone
Let’s first take a look at the core functions of WeClone.
Core Features
Fine-tuning LLM using WeChat chat records
WeClone supports exporting WeChat chat records and formatting them into question-and-answer format for model fine-tuning.
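For reference, one fine-tuning sample might look roughly like the sketch below. The field names here are illustrative assumptions, not WeClone's actual schema, which depends on its dataset configuration.

# Illustrative sketch only: field names are assumptions, not WeClone's spec.
import json

sample = {
    "instruction": "What are you up to this weekend?",  # the other side's message
    "output": "Sleeping in, then hotpot. Wanna come?",  # your reply, in your own voice
}
print(json.dumps(sample, ensure_ascii=False, indent=2))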
For model fine-tuning, WeClone supports low-resource, LoRA-based fine-tuning of mainstream 0.5B–7B models, including ChatGLM3-6B and Qwen2.5-7B, effectively capturing the user's language habits and expression style.
Model training requires about 16 GB of VRAM. The VRAM requirement is manageable and training is efficient, suiting small-sample, low-resource scenarios. Per-model VRAM estimates are listed in the project README.
Using WeChat Voice Messages + the Spark-TTS Model for High-Quality Voice Cloning
The project's companion submodule WeClone-audio (https://github.com/xming521/WeClone/tree/master/WeClone-audio) is based on the lightweight Spark-TTS model. With a 0.5B-parameter model and a 5-second voice sample, it can clone a voice with up to 95% similarity, further enhancing the realism of the digital avatar.
Multi-platform deployment
Through the AstrBot framework, the digital avatar can be deployed to multiple chat platforms such as WeChat, QQ, Telegram, WeChat Work, and Lark. A single command brings it online for real-time conversation.
Possible application scenarios
Personal assistant customization: When you are busy, your digital avatar can reply to messages and handle daily affairs on your behalf, such as writing emails and replying to comments.
Content creation: quickly produce personalized text content in a specific style to help you operate multiple accounts with the same style, such as writing tweets, scripts, and commentaries.
Digital immortality: create a digital avatar of yourself or someone else, achieving a kind of digital immortality.
Core Module Introduction
WeClone's end-to-end digital-clone pipeline consists of three modules:
Data export and preprocessing → LoRA model fine-tuning → Multi-platform deployment
Next, let’s look at the technical highlights of each module.
1. Data export and preprocessing
WeClone first converts the CSV/SQLite files exported from WeChat into standard JSON. It then cleans the text, mainly removing noise and filtering out sensitive information. Finally, it segments the conversations, annotating the chat records segment by segment and retaining timestamps.
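As a rough illustration of this stage, here is a minimal sketch. The column names, file paths, and filtering rules are assumptions for demonstration, not WeClone's actual implementation (see ./make_dataset/csv_to_json.py for the real logic).

# Minimal preprocessing sketch -- column names, paths, and the filtering
# rules are illustrative assumptions, not WeClone's actual code.
import csv
import json
import re

# crude patterns for phone numbers, emails, and URLs
SENSITIVE = re.compile(r"\b\d{11}\b|\S+@\S+|https?://\S+")

records = []
with open("data/example_chat.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        text = row["msg"].strip()
        if not text or SENSITIVE.search(text):
            continue  # remove noise and sensitive information
        records.append({"sender": row["talker"], "text": text, "ts": row["CreateTime"]})

with open("data/chat.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)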
2. LoRA model fine-tuning
WeClone uses ChatGLM3-6B as the base model and performs SFT (Supervised Fine-Tuning) based on the LoRA framework.
Key highlights include:
Use low-rank adapters to significantly reduce trainable parameters.
Compatible with single-machine and distributed training, with support for multi-GPU acceleration.
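As a hedged sketch of what a LoRA-based SFT setup looks like with the Hugging Face peft library (WeClone's actual hyperparameters live in settings.json; the values below are placeholders):

# LoRA setup sketch with Hugging Face peft -- values are placeholders,
# not WeClone's actual configuration.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
lora_config = LoraConfig(
    r=8,                                 # low-rank dimension (lora_rank)
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["query_key_value"],  # ChatGLM3's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()       # shows how few parameters LoRA actually trains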
3. Multi-platform deployment
WeClone packages the fine-tuned model with FastAPI/Flask, supporting hybrid GPU/CPU deployment, multi-platform access, and custom parameters.
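For intuition, a minimal FastAPI service might look like the sketch below. The endpoint name and request shape are illustrative; WeClone's real service lives in src/api_service.py.

# Minimal FastAPI serving sketch -- endpoint and request shape are assumptions.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    prompt: str

@app.post("/chat")
def chat(req: ChatRequest):
    # The real service would call the fine-tuned model here.
    return {"reply": f"(digital avatar) {req.prompt}"}

# Run with: uvicorn api_sketch:app --host 0.0.0.0 --port 8005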
Installation and deployment tutorial
Environment Construction
It is recommended to use uv, a very fast Python environment manager. After installing uv, use the following commands to create a new Python environment and install the dependencies (note that this does not include the dependencies for the xcodec audio-cloning feature):
git clone https://github.com/xming521/WeClone.git
cd WeClone
uv venv .venv --python=3.9
source .venv/bin/activate
uv pip install --group main -e .
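Optionally, you can verify that PyTorch sees your GPU before continuing (a quick sanity check, not part of the official steps):
python -c "import torch; print(torch.cuda.is_available())"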
Note
Training and inference related configurations are unified in the file settings.json.
Data preparation
Please use PyWxDump to extract WeChat chat records. After downloading the software and decrypting the database, click "Chat Backup" and set the export type to CSV. You can export multiple contacts or group chats. Then place the csv folder exported to wxdump_tmp/export into the ./data directory, so that the chat-record folders of different people sit together under ./data/csv.
Sample data is located at data/example_chat.csv.
Data preprocessing
By default, the project removes mobile phone numbers, ID numbers, email addresses, and URLs from the data. It also provides a blocked_words list, to which you can add words and sentences to filter (by default, any sentence containing a blocked word is removed entirely). The ./make_dataset/csv_to_json.py script processes the data.
When the same person sends multiple sentences in a row, there are several ways to handle it, selectable during preprocessing.
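For illustration, one natural strategy is to concatenate consecutive messages from the same sender (a sketch; the field names are assumptions, and WeClone's actual options are chosen during preprocessing):

# Sketch of one merging strategy: join consecutive messages from the same
# sender into a single turn. Field names are illustrative assumptions.
def merge_consecutive(messages):
    merged = []
    for msg in messages:
        if merged and merged[-1]["sender"] == msg["sender"]:
            merged[-1]["text"] += ", " + msg["text"]  # join with a separator
        else:
            merged.append(dict(msg))
    return merged

msgs = [
    {"sender": "me", "text": "just got home"},
    {"sender": "me", "text": "so tired today"},
    {"sender": "friend", "text": "rest up!"},
]
print(merge_consecutive(msgs))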
Model Download
First download the ChatGLM3 model from Hugging Face. If you have trouble downloading from Hugging Face, you can use the ModelScope community instead: before any subsequent training and inference, first run export USE_MODELSCOPE_HUB=1 so that the ModelScope copy of the model is used.
Since the model is large, the download will take a while; please be patient.
export USE_MODELSCOPE_HUB=1 # Windows uses `set USE_MODELSCOPE_HUB=1`
git lfs install
git clone https://www.modelscope.cn/ZhipuAI/chatglm3-6b.git
Note that the modeling_chatglm.py file in the ModelScope copy needs to be replaced with the Hugging Face version.
Configure parameters and fine-tune the model
(Optional) Modify settings.json to select another locally downloaded model.
Adjust per_device_train_batch_size and gradient_accumulation_steps to control VRAM usage. Depending on the quantity and quality of your dataset, you can also modify num_train_epochs, lora_rank, lora_dropout, and other parameters.
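For instance, the relevant knobs might look like the sketch below. These values are placeholders to show the trade-offs; the real structure and defaults are in settings.json.

# Illustrative values only -- the actual structure and defaults are
# defined in WeClone's settings.json.
train_args = {
    "per_device_train_batch_size": 2,  # lower this if you run out of VRAM
    "gradient_accumulation_steps": 8,  # raise this to keep the effective batch size
    "num_train_epochs": 3,             # more epochs can help small datasets
    "lora_rank": 8,
    "lora_dropout": 0.1,
}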
Single card training
Run src/train_sft.py to fine-tune on a single GPU:
python src/train_sft.py
Note: in the SFT stage my loss only dropped to about 3.5; pushing it much lower may cause overfitting. I used about 20,000 consolidated valid entries.
Multi-card training
uv pip install deepspeed
deepspeed --num_gpus=<number_of_gpus> src/train_sft.py
Simple inference with the browser demo
python ./src/web_demo.py
Inference via the API
python ./src/api_service.py
Testing with common chat questions
python ./src/api_service.py
python ./src/test_model.py
Deploy to chatbot
AstrBot Solution
AstrBot is an easy-to-use multi-platform LLM chatbot and development framework.
Steps:
1. Deploy AstrBot.
2. Deploy the messaging platform in AstrBot.
3. Run python ./src/api_service.py to start the API service.
4. Add a new service provider in AstrBot: select OpenAI as the type, fill in the API Base URL according to how AstrBot is deployed (for example, a Docker deployment may be http://172.17.0.1:8005/v1), and enter gpt-3.5-turbo as the model name.
5. Tool calls are not supported after fine-tuning, so turn off the default tools first by sending the command /tool off reminder on the messaging platform; otherwise the fine-tuned effect will not come through.
6. Set the system prompt in AstrBot according to the default_system used during fine-tuning.
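To verify the connection end to end, you can hit the API service with any OpenAI-compatible client, for example (a sketch; the base URL is an assumption based on the Docker example above, so adjust it to your deployment):

# Sanity check against the local API service -- base_url is an assumption
# based on the Docker example above; adjust to your deployment.
from openai import OpenAI

client = OpenAI(base_url="http://172.17.0.1:8005/v1", api_key="none")
resp = client.chat.completions.create(
    model="gpt-3.5-turbo",  # the placeholder model name configured above
    messages=[{"role": "user", "content": "Hey, are you there?"}],
)
print(resp.choices[0].message.content)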