5 Techniques for Fine-tuning Large Language Models (LLMs)

Master cutting-edge fine-tuning technology to improve AI model performance.
Core content:
1. LoRA technology and its efficient fine-tuning advantages
2. Innovations of LoRA-FA, VeRA, Delta-LoRA and LoRA+
3. Application scenarios and performance optimization of five fine-tuning technologies
In the field of artificial intelligence, fine-tuning large language models (LLMs) has long been an important means of improving model performance. Traditionally, fine-tuning an LLM means adjusting billions of parameters, which demands substantial compute and memory. With the emergence of several innovative methods, however, this process has changed dramatically.
Today, we will walk you through five recent LLM fine-tuning techniques, so that these concepts are easy to understand.
1️⃣ LoRA (Low Rank Adaptation)
LoRA freezes the main weight matrix W and introduces two small low-rank matrices, A and B, whose product represents the weight update. By training only these two smaller matrices instead of adjusting the huge matrix W directly, fine-tuning becomes far more efficient and manageable: both the amount of computation and the number of parameters that must be stored per task are greatly reduced. A minimal sketch follows below.
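Here is a minimal PyTorch sketch of the idea. The rank r, the scaling factor alpha, the layer sizes, and the initialization choices are illustrative assumptions, not values from any particular paper or library:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Linear layer with a frozen weight W and a trainable low-rank update B @ A."""
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        # Frozen pretrained weight W (random here for illustration only)
        self.weight = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)  # low-rank factor A (trainable)
        self.B = nn.Parameter(torch.zeros(out_features, r))        # low-rank factor B (trainable, starts at zero)
        self.scaling = alpha / r

    def forward(self, x):
        # y = x W^T + scaling * (x A^T) B^T  ==  x (W + scaling * B A)^T
        return x @ self.weight.T + self.scaling * (x @ self.A.T) @ self.B.T

layer = LoRALinear(1024, 1024, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable params: {trainable}")  # only A and B are updated
```

Because B starts at zero, the layer initially behaves exactly like the pretrained model, and the low-rank update grows only as training proceeds.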
2️⃣ LoRA-FA (LoRA with frozen A)
LoRA-FA is a further development of LoRA that reduces computational requirements by freezing matrix A after initialization. Only matrix B is updated, which cuts the activation memory needed for A's gradients while keeping training stable. This approach is particularly suitable for resource-limited environments; see the sketch below.
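Continuing the sketch above, LoRA-FA can be expressed by simply freezing A as well. Again, the sizes and initialization are illustrative assumptions:

```python
import torch
import torch.nn as nn

class LoRAFALinear(nn.Module):
    """LoRA-FA: the low-rank factor A is frozen after initialization; only B is trained."""
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)  # frozen W
        # A is randomly initialized and then frozen, so no gradient needs to be computed for it
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01, requires_grad=False)
        self.B = nn.Parameter(torch.zeros(out_features, r))  # the only trainable tensor
        self.scaling = alpha / r

    def forward(self, x):
        return x @ self.weight.T + self.scaling * (x @ self.A.T) @ self.B.T
```

The forward pass is identical to LoRA; the saving comes from the optimizer and backward pass touching only B.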
3️⃣ VeRA (Vector-based Random Matrix Adaptation)
VeRA focuses on memory efficiency: a single pair of randomly initialized matrices A and B is frozen and shared across all layers, and fine-tuning is achieved by training only tiny per-layer scaling vectors. This makes VeRA extremely memory-friendly and well suited to scenarios that require fast fine-tuning with a small memory footprint. A sketch follows below.
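The following sketch shows one way this can look in PyTorch. The shapes, the vector names d and b, and the initialization are illustrative assumptions:

```python
import torch
import torch.nn as nn

# One pair of frozen random matrices shared by every adapted layer (illustrative sizes).
r, d_in, d_out = 8, 1024, 1024
shared_A = torch.randn(r, d_in) * 0.01   # frozen, shared across layers
shared_B = torch.randn(d_out, r) * 0.01  # frozen, shared across layers

class VeRALinear(nn.Module):
    """VeRA-style layer: frozen shared A and B; only per-layer scaling vectors are trained."""
    def __init__(self, weight, A, B):
        super().__init__()
        self.register_buffer("weight", weight)      # frozen pretrained W
        self.register_buffer("A", A)                # shared, frozen
        self.register_buffer("B", B)                # shared, frozen
        self.d = nn.Parameter(torch.ones(A.shape[0]))   # scales the rank dimension
        self.b = nn.Parameter(torch.zeros(B.shape[0]))  # scales the output dimension

    def forward(self, x):
        # y = x W^T + b ⊙ ((d ⊙ (x A^T)) B^T)
        return x @ self.weight.T + ((x @ self.A.T) * self.d) @ self.B.T * self.b

layer = VeRALinear(torch.randn(d_out, d_in), shared_A, shared_B)
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # only r + d_out values per layer
```

Each adapted layer adds only a handful of trainable values, which is why the method is so light on memory.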
4️⃣ Delta-LoRA
Delta-LoRA is a LoRA variant that also updates the main weight matrix W: at each training step, the difference (delta) between the product of matrices A and B after the step and the same product before the step is added to W. This provides a dynamic but controlled way of updating W without computing its gradients, enabling fast adaptation while maintaining model stability. A sketch of one training step follows below.
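Here is an illustrative single training step. The shapes, the loss, and the update ratio lam are assumptions made for the sketch, not reference code from the paper:

```python
import torch
import torch.nn as nn

out_f, in_f, r = 1024, 1024, 8
W = torch.randn(out_f, in_f)                      # frozen main weight, never receives gradients
A = nn.Parameter(torch.randn(r, in_f) * 0.01)
B = nn.Parameter(torch.zeros(out_f, r))
opt = torch.optim.AdamW([A, B], lr=1e-4)
lam = 0.5                                         # how strongly W follows the low-rank delta (assumed value)

def training_step(x, target):
    prev_product = (B @ A).detach()               # product A·B before this step
    pred = x @ (W + B @ A).T
    loss = nn.functional.mse_loss(pred, target)
    opt.zero_grad()
    loss.backward()
    opt.step()                                    # gradient update for A and B only, as in LoRA
    with torch.no_grad():
        W.add_(lam * ((B @ A) - prev_product))    # W ← W + λ(A·B after step − A·B before step)
    return loss.item()

training_step(torch.randn(4, in_f), torch.randn(4, out_f))
```

Note that W is shifted along the same direction the low-rank factors just moved in, so no gradient (or optimizer state) is ever kept for W itself.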
5️⃣ LoRA+ (Enhanced LoRA)
LoRA+ is an optimized version of LoRA in which matrix B is trained with a higher learning rate than matrix A. This simple adjustment makes the learning process faster and more efficient, so LoRA+ is particularly suitable for scenarios that require fast convergence and high performance; see the sketch below.
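In practice this is mostly an optimizer configuration. The sketch below assumes LoRA parameters whose names contain "lora_A" and "lora_B" (a common but not universal naming convention), and the base learning rate and ratio of 16 are illustrative:

```python
import torch

def lora_plus_param_groups(model, base_lr=1e-4, lr_ratio=16):
    """Build optimizer parameter groups that give the B matrices a larger learning rate than A."""
    a_params = [p for n, p in model.named_parameters() if "lora_A" in n and p.requires_grad]
    b_params = [p for n, p in model.named_parameters() if "lora_B" in n and p.requires_grad]
    return [
        {"params": a_params, "lr": base_lr},             # A trains at the base learning rate
        {"params": b_params, "lr": base_lr * lr_ratio},  # B trains faster
    ]

# Usage (assuming `model` is a LoRA-adapted model with the naming above):
# optimizer = torch.optim.AdamW(lora_plus_param_groups(model))
```

Everything else about the training loop stays the same as plain LoRA.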
Summary
With the advent of these innovative techniques, fine-tuning large language models has become more efficient and feasible. LoRA, LoRA-FA, VeRA, Delta-LoRA, and LoRA+ each offer unique advantages for different application scenarios and resource constraints. With these methods, researchers and engineers can more flexibly adjust and optimize their models to achieve better performance in a variety of tasks.