WeClone: Fine-tuning a large language model using WeChat chat logs

Written by Caleb Hayes
Updated on: June 19, 2025
Recommendation

Use personal WeChat chat records to fine-tune a large language model and create your own intelligent digital avatar.

Core content:
1. A full-pipeline solution from chat records to model fine-tuning
2. Chat-record training, high-quality voice cloning, and WeChat bot binding
3. A technical guide covering environment setup, data preparation, and personalized optimization

 
Recommended by Yang Fangxian, Founder of 53A and Most Valuable Expert of Tencent Cloud (TVP)

 

WeClone fine-tunes a large language model (LLM) on your personal WeChat chat records to create a unique digital avatar. It provides a full-pipeline solution: from chat data to model fine-tuning, from text generation to voice cloning, and from training to deployment, so that your digital avatar not only "speaks your words" but also "sounds like you".

 

Key Features

  • Chat-history training: fine-tune the large language model on WeChat chat history so that it imitates the way you speak.

  • High-quality voice cloning: a 0.5B-parameter model and a 5-second voice sample can generate voices with up to 95% similarity.

  • WeChat bot binding: connect your digital avatar to WeChat, with support for automatic text and voice replies.

  • Data preprocessing tools: scripts convert chat records into training data, filtering sensitive information by default.

  • Model personalization: LoRA fine-tuning makes the model match your language habits more closely.

Installation and Usage

Hardware requirements

Currently the project uses the chatglm3-6b model by default and fine-tunes it with LoRA in the SFT stage, which requires about 16GB of video memory. You can also use other models and methods supported by LLaMA Factory, which use less video memory, but you will need to modify the template's system prompt and other related configuration yourself.

Estimated video memory required:

| Method                          | Precision | 7B    | 14B   | 30B   | 70B    | xB     |
| ------------------------------- | --------- | ----- | ----- | ----- | ------ | ------ |
| Full (bf16 or fp16)             | 32        | 120GB | 240GB | 600GB | 1200GB | 18x GB |
| Full (pure_bf16)                | 16        | 60GB  | 120GB | 300GB | 600GB  | 8x GB  |
| Freeze/LoRA/GaLore/APOLLO/BAdam | 16        | 16GB  | 32GB  | 64GB  | 160GB  | 2x GB  |
| QLoRA                           | 8         | 10GB  | 20GB  | 40GB  | 80GB   | x GB   |
| QLoRA                           | 4         | 6GB   | 12GB  | 24GB  | 48GB   | x/2 GB |
| QLoRA                           | 2         | 4GB   | 8GB   | 16GB  | 24GB   | x/4 GB |
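The table's last column compresses to a rule of thumb: multiply the model size x (in billions of parameters) by a per-method factor. A minimal sketch of that arithmetic follows; note it is only a rough estimate (the per-size columns in the table are the authoritative numbers, and real usage also depends on sequence length, batch size, and activation overhead):

```python
# Rough VRAM estimate (GB) following the table's "xB" column,
# where x is the model size in billions of parameters.
FACTORS = {
    "full_fp32": 18,    # full fine-tuning, 32-bit
    "full_bf16": 8,     # full fine-tuning, pure_bf16
    "lora_16bit": 2,    # Freeze/LoRA/GaLore/APOLLO/BAdam
    "qlora_8bit": 1,
    "qlora_4bit": 0.5,
    "qlora_2bit": 0.25,
}

def estimate_vram_gb(params_billion: float, method: str) -> float:
    """Multiply the parameter count (in billions) by the method's factor."""
    return params_billion * FACTORS[method]

print(estimate_vram_gb(6, "lora_16bit"))  # chatglm3-6b with LoRA -> 12.0
```

The 12GB figure for chatglm3-6b is below the ~16GB the project reports in practice, which illustrates why these factors should be treated as a lower bound.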

Environment Construction

It is recommended to use uv, a very fast Python environment manager. After installing uv, use the following commands to create a new Python environment and install the dependencies. Note that this does not include the dependencies for the xcodec (audio cloning) feature:

git clone https://github.com/xming521/WeClone.git
cd WeClone
uv venv .venv --python=3.9
source .venv/bin/activate
uv pip install --group main -e . 

Notice

Training- and inference-related configurations are unified in the file settings.json
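As a rough illustration of what lives in settings.json, here is a hypothetical fragment. The keys largely mirror LLaMA Factory's training arguments, but treat the exact names and values below as assumptions, not the project's actual file:

```json
{
  "model_name_or_path": "./chatglm3-6b",
  "per_device_train_batch_size": 4,
  "gradient_accumulation_steps": 8,
  "num_train_epochs": 3,
  "lora_rank": 8,
  "lora_dropout": 0.1
}
```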

Data preparation

Please use PyWxDump to extract WeChat chat records. After downloading the software and decrypting the database, click Chat Backup and set the export type to CSV. You can export multiple contacts or group chats. Then place the exported folder wxdump_tmp/export under the ./data directory as ./data/csv, i.e. put the folders of different people's chat records together. The sample data is located in data/example_chat.csv.

Data preprocessing

By default, the project removes mobile phone numbers, ID numbers, email addresses, and website addresses from the data. It also provides a blocked-word list, blocked_words, to which you can add words and sentences that need to be filtered (by default, the entire sentence containing a blocked word is removed). Execute the script ./make_dataset/csv_to_json.py to process the data.
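The cleaning behavior described above can be sketched roughly as follows. The regex patterns and the blocked-word entries here are illustrative assumptions; the project's real rules live in its preprocessing script:

```python
import re
from typing import Optional

# Illustrative patterns -- not the project's actual ones.
PATTERNS = [
    re.compile(r"1[3-9]\d{9}"),              # mainland-China mobile numbers
    re.compile(r"\d{17}[\dXx]"),             # 18-digit ID numbers
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),  # email addresses
    re.compile(r"https?://\S+"),             # website addresses
]
BLOCKED_WORDS = ["password"]  # hypothetical blocked_words entries

def clean_message(text: str) -> Optional[str]:
    """Return the cleaned message, or None if it contains a blocked word."""
    for word in BLOCKED_WORDS:
        if word in text:
            return None  # drop the entire sentence, as the default behavior does
    for pat in PATTERNS:
        text = pat.sub("", text)
    text = re.sub(r"\s{2,}", " ", text)  # collapse gaps left by removals
    return text.strip()

print(clean_message("reach me at foo@example.com"))  # -> reach me at
```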

When the same person answers with multiple sentences in a row, there are three ways to handle it:

| Script                                              | Processing                                        |
| --------------------------------------------------- | ------------------------------------------------- |
| csv_to_json.py                                      | Concatenate the sentences with commas             |
| csv_to_json - single sentence answer.py (deprecated) | Keep only the longest answer as the final data    |
| csv_to_json - single sentence multiple rounds.py    | Place earlier sentences in the prompt's 'history' |
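The three strategies above can be sketched as follows; the function names and the exact 'history' structure are illustrative, not the scripts' actual code:

```python
from typing import Dict, List

def join_with_commas(replies: List[str]) -> str:
    """csv_to_json.py: concatenate consecutive replies with commas."""
    return ",".join(replies)

def keep_longest(replies: List[str]) -> str:
    """Deprecated single-sentence variant: keep only the longest reply."""
    return max(replies, key=len)

def as_history(replies: List[str]) -> Dict[str, object]:
    """Multi-round variant: earlier replies go into the prompt 'history'."""
    return {"history": replies[:-1], "reply": replies[-1]}

replies = ["ok", "see you at eight", "bring the charger"]
print(join_with_commas(replies))  # ok,see you at eight,bring the charger
print(keep_longest(replies))      # see you at eight
print(as_history(replies))
```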

Model Download

First download the ChatGLM3 model from Hugging Face. If you encounter problems downloading from Hugging Face, you can use the ModelScope community via the method below; subsequent training and inference must then also set USE_MODELSCOPE_HUB=1 first in order to use the ModelScope model. Since the model is large, the download takes a while; please be patient.

export USE_MODELSCOPE_HUB=1 # Windows uses `set USE_MODELSCOPE_HUB=1`
git lfs install
git clone https://www.modelscope.cn/ZhipuAI/chatglm3-6b.git

The modeling_chatglm.py file from ModelScope needs to be replaced with the Hugging Face version.

Configure parameters and fine-tune the model

  • (Optional) Modify settings.json to select another locally downloaded model.

  • Modify per_device_train_batch_size and gradient_accumulation_steps to adjust video memory usage.

  • You can modify parameters such as num_train_epochs, lora_rank, and lora_dropout according to the quantity and quality of your own dataset.
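The two batch parameters above trade peak memory against step size: their product (times the GPU count) is the effective batch size, so halving one and doubling the other keeps training behavior roughly constant while lowering VRAM. A one-line sketch of that arithmetic:

```python
def effective_batch_size(per_device: int, grad_accum: int, num_gpus: int = 1) -> int:
    """Effective batch size = per-device batch * accumulation steps * GPU count."""
    return per_device * grad_accum * num_gpus

# Halving per_device_train_batch_size and doubling
# gradient_accumulation_steps keeps the effective batch size constant.
print(effective_batch_size(4, 8))   # -> 32
print(effective_batch_size(2, 16))  # -> 32
```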

Single card training

Run src/train_sft.py for SFT-stage fine-tuning. My loss only dropped to about 3.5; reducing it too much may lead to overfitting. I used about 20,000 consolidated valid data entries.

python src/train_sft.py

Multi-card training

uv pip install deepspeed
deepspeed --num_gpus=<number of GPUs> src/train_sft.py

Hands-on practice

Scenario 1: Simple inference using the browser demo

python ./src/web_demo.py 

Scenario 2: Inference using the API

python ./src/api_service.py
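The AstrBot section later treats this service as an OpenAI-compatible endpoint (it mentions a base URL like http://172.17.0.1:8005/v1), so a plain HTTP client should be able to query it too. The host, port, and route below are assumptions based on that section; a sketch using only the standard library:

```python
import json
from urllib import request

# Assumed endpoint -- adjust host/port to match your deployment.
URL = "http://127.0.0.1:8005/v1/chat/completions"

def build_chat_request(prompt: str) -> request.Request:
    """Build an OpenAI-style chat-completion request for the local service."""
    payload = {
        "model": "gpt-3.5-turbo",  # placeholder name, as in the AstrBot setup
        "messages": [{"role": "user", "content": prompt}],
    }
    return request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("hello")
print(req.full_url, req.get_method())
# Sending it requires the service to be running:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```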

Scenario 3: Testing with common chat questions

python ./src/api_service.py
python ./src/test_model.py

Scenario 4: Deploy to a chatbot

AstrBot Solution

AstrBot is an easy-to-use, multi-platform LLM chatbot and development framework. It supports QQ, QQ Channels, Telegram, WeChat, WeCom (Enterprise WeChat), and Feishu.

Steps:

  1. Deploy AstrBot

  2. Deploy the messaging platform in AstrBot

  3. Execute python ./src/api_service.py to start the API service

  4. Add a new service provider in AstrBot: select OpenAI as the type, fill in the API Base URL according to how AstrBot is deployed (for a Docker deployment it may be http://172.17.0.1:8005/v1), and enter gpt-3.5-turbo as the model name

  5. The model does not support tool calls after fine-tuning, so first turn off the default tools by sending the command /tool off reminder on the messaging platform; otherwise the fine-tuned effect will not appear.

  6. Set the system prompt in AstrBot according to the default_system value used during fine-tuning.