WeClone: Fine-tuning a large language model using WeChat chat logs

Use personal WeChat chat records to fine-tune a large language model and create your own intelligent digital avatar.
Core content:
1. An end-to-end pipeline from chat records to model fine-tuning
2. Chat-history training, high-quality voice cloning, and WeChat bot integration
3. A technical guide covering environment setup, data preparation, and personalized optimization
WeClone fine-tunes a large language model (LLM) on personal WeChat chat records to create a unique digital avatar. It provides an end-to-end pipeline from chat data to model fine-tuning, from text generation to voice cloning, and from training to deployment, so that your digital avatar not only "speaks your words" but also "sounds like you".
Key Features
- Chat history training: fine-tune the large language model on your WeChat chat history so it imitates the way you speak.
- High-quality voice cloning: with a 0.5B-parameter model and a 5-second voice sample, generate voices with up to 95% similarity.
- WeChat bot integration: connect your digital avatar to WeChat, with automatic text and voice replies.
- Data preprocessing tools: scripts that convert chat records into training data, filtering sensitive information by default.
- Personalized model optimization: LoRA fine-tuning is supported to bring the model closer to your language habits.
Installation and Usage
Hardware requirements
Currently, the project uses the chatglm3-6b model by default and fine-tunes it in the SFT stage with the LoRA method, which requires about 16GB of GPU memory. You can also use other models and methods supported by LLaMA Factory that use less GPU memory, but you will need to modify the template's system prompt and other related configurations yourself.
Estimated GPU memory requirements (x = model parameters in billions):
| Method | Bits | Estimated GPU memory |
| --- | --- | --- |
| Full (AMP) | 32 | 18x GB |
| Full | 16 | 8x GB |
| Freeze/LoRA | 16 | 2x GB |
| QLoRA | 8 | x GB |
| QLoRA | 4 | x/2 GB |
| QLoRA | 2 | x/4 GB |
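As a quick illustration of the table, the rule-of-thumb multipliers can be turned into a rough estimate like the sketch below; the function and the method labels are only illustrative, not part of the project:

```python
# Rough GPU-memory estimate for a model with `params_b` billion parameters,
# based on the rule-of-thumb multipliers in the table above (illustrative only).
MULTIPLIERS = {
    "full_amp": 18,      # full fine-tuning with AMP
    "full_16bit": 8,     # full fine-tuning in 16-bit
    "freeze_lora": 2,    # Freeze / LoRA fine-tuning in 16-bit
    "qlora_8bit": 1,     # QLoRA with 8-bit quantization
    "qlora_4bit": 0.5,   # QLoRA with 4-bit quantization
    "qlora_2bit": 0.25,  # QLoRA with 2-bit quantization
}

def estimate_gpu_memory_gb(params_b: float, method: str) -> float:
    """Return the estimated GPU memory in GB for the given fine-tuning method."""
    return params_b * MULTIPLIERS[method]

# Example: LoRA fine-tuning of a 6B model (e.g. chatglm3-6b) is roughly
# 2 * 6 = 12 GB by this rule of thumb; the project quotes about 16GB in practice.
print(estimate_gpu_memory_gb(6, "freeze_lora"))
```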
Environment Setup
It is recommended to use uv, a very fast Python environment manager. After installing uv, use the following commands to create a new Python environment and install the dependencies. Note that this does not include the dependencies for the xcodec (voice cloning) feature:
git clone https://github.com/xming521/WeClone.git
cd WeClone
uv venv .venv --python=3.9
source .venv/bin/activate
uv pip install --group main -e .
Note
Training and inference configurations are unified in the file settings.json.
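As a minimal sketch of how to inspect that file, the snippet below loads settings.json and prints the parameters mentioned later in this guide; the exact key names and layout of settings.json are an assumption here:

```python
import json

# Load the unified training/inference configuration.
# The key layout of settings.json is assumed here for illustration.
with open("settings.json", encoding="utf-8") as f:
    settings = json.load(f)

# Parameters referenced later in this guide; adjust them before training.
for key in ("per_device_train_batch_size", "gradient_accumulation_steps",
            "num_train_epochs", "lora_rank", "lora_dropout", "default_system"):
    print(key, "=", settings.get(key))
```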
Data preparation
Please use PyWxDump to extract WeChat chat records. After downloading the software and decrypting the database, click Chat Backup and set the export type to CSV. You can export multiple contacts or group chats. Then place the exported wxdump_tmp/export/csv folder in the ./data directory, i.e., put the folders containing different people's chat records together under ./data/csv. The sample data is located in data/example_chat.csv.
Data preprocessing
By default, the project removes mobile phone numbers, ID numbers, email addresses, and URLs from the data. It also provides a blocked-word list, blocked_words, to which you can add words and sentences that need to be filtered (by default, any sentence containing a blocked word is removed in its entirety). Execute the script ./make_dataset/csv_to_json.py to process the data.
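The filtering logic can be pictured with the sketch below. This is a simplified illustration rather than the project's actual csv_to_json.py; the regular expressions and the blocked_words.json file name and format are assumptions:

```python
import json
import re
from typing import Optional

# Simplified, illustrative patterns for the fields removed by default:
# mobile phone numbers, ID numbers, email addresses and URLs.
SENSITIVE_PATTERNS = [
    re.compile(r"1\d{10}"),                   # mainland mobile numbers
    re.compile(r"\d{17}[\dXx]"),              # 18-digit ID numbers
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),  # email addresses
    re.compile(r"https?://\S+"),              # URLs
]

# Assumed file name and format for the blocked-word list.
with open("blocked_words.json", encoding="utf-8") as f:
    blocked_words = json.load(f)["blocked_words"]

def clean(sentence: str) -> Optional[str]:
    """Drop the whole sentence if it contains a blocked word;
    otherwise strip sensitive fields from it."""
    if any(word in sentence for word in blocked_words):
        return None
    for pattern in SENSITIVE_PATTERNS:
        sentence = pattern.sub("", sentence)
    return sentence
```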
When the same person answers multiple sentences in a row, there are three ways to deal with it:
| Script | Handling method |
| --- | --- |
| csv_to_json.py | Joins the consecutive replies with commas |
| csv_to_json-单句回答.py (deprecated) | Keeps only the longest reply as the final data |
| csv_to_json-单句多轮.py | Puts the earlier replies into the prompt's history |
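For example, the comma-joining strategy can be sketched as follows; this is a minimal illustration, not the project's actual implementation, and the data layout (a time-ordered list of (sender, text) pairs) is an assumption:

```python
# Minimal sketch of the "join consecutive replies" strategy.
# `messages` is assumed to be a time-ordered list of (sender, text) pairs.
def merge_consecutive(messages):
    merged = []
    for sender, text in messages:
        if merged and merged[-1][0] == sender:
            # Same sender as the previous message: join into one reply.
            merged[-1] = (sender, merged[-1][1] + "，" + text)
        else:
            merged.append((sender, text))
    return merged

# Example: three replies in a row collapse into a single training answer.
print(merge_consecutive([("friend", "在吗"), ("me", "在"), ("me", "怎么了"), ("me", "说吧")]))
```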
Model Download
First download the ChatGLM3 model from Hugging Face. If you run into problems downloading from Hugging Face, you can use the ModelScope community instead, as shown below. In that case, subsequent training and inference must be run with USE_MODELSCOPE_HUB=1 set first so that the ModelScope model is used. Since the model is large, the download takes quite a while; please be patient.
export USE_MODELSCOPE_HUB=1 # Windows uses `set USE_MODELSCOPE_HUB=1`
git lfs install
git clone https://www.modelscope.cn/ZhipuAI/chatglm3-6b.git
The modeling_chatglm.py file from ModelScope needs to be replaced with the one from Hugging Face.
Configure parameters and fine-tune the model
- (Optional) Modify settings.json to select another locally downloaded model.
- Adjust per_device_train_batch_size and gradient_accumulation_steps to control GPU memory usage (see the example after this list).
- Parameters such as num_train_epochs, lora_rank, and lora_dropout can be adjusted according to the quantity and quality of your own dataset.
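As a quick worked example of how the two memory-related parameters trade off (the concrete numbers below are only illustrative): the effective batch size is their product times the number of GPUs, so lowering the per-device batch to save memory can be offset by raising the accumulation steps.

```python
# Effective batch size = per-device batch * gradient accumulation steps * number of GPUs.
# Lowering per_device_train_batch_size to save GPU memory can be offset by
# raising gradient_accumulation_steps so the optimizer still sees the same batch.
per_device_train_batch_size = 2   # example value
gradient_accumulation_steps = 8   # example value
num_gpus = 1

effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch_size)  # -> 16
```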
Single-GPU training
Run src/train_sft.py for the SFT-stage fine-tuning. My loss only dropped to about 3.5; pushing it much lower may lead to overfitting. I used about 20,000 consolidated valid entries.
python src/train_sft.py
Multi-GPU training
uv pip install deepspeed
deepspeed --num_gpus=<number of GPUs to use> src/train_sft.py
Hands-on practice
Scenario 1: Simple inference using the browser demo
python ./src/web_demo.py
Scenario 2: Inference through the API
python ./src/api_service.py
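If the service exposes an OpenAI-compatible interface, as the AstrBot configuration in Scenario 4 suggests, a request could look like the sketch below; the port 8005, the /v1/chat/completions path, and the gpt-3.5-turbo model name are assumptions taken from that scenario:

```python
import json
import urllib.request

# Assumed OpenAI-compatible endpoint; port 8005 and the model name
# "gpt-3.5-turbo" come from the AstrBot configuration in Scenario 4.
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "最近怎么样？"}],
}
req = urllib.request.Request(
    "http://127.0.0.1:8005/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    reply = json.loads(resp.read())
    print(reply["choices"][0]["message"]["content"])
```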
Scenario 3: Testing with common chat questions
python ./src/api_service.py
python ./src/test_model.py
Scenario 4: Deploy to a chatbot
AstrBot Solution
AstrBot is an easy-to-use multi-platform LLM chatbot and development framework. It supports QQ, QQ Channel, Telegram, WeChat, WeCom (WeChat Work), and Feishu.
Steps:
- Deploy AstrBot
- Deploy the messaging platform in AstrBot
- Execute python ./src/api_service.py to start the API service
- Add a new service provider in AstrBot: select OpenAI as the type, set the API Base URL according to how AstrBot is deployed (for example, for a Docker deployment it may be http://172.17.0.1:8005/v1), and fill in gpt-3.5-turbo as the model
- Tool calls are not supported after fine-tuning. First turn off the default tools by sending the command /tool off reminder on the messaging platform; otherwise the fine-tuned behavior will not show
- Set the system prompt in AstrBot according to the default_system value used during fine-tuning