Fine-tuning DeepSeek-R1-Distill-Qwen-32B on a single RTX 4090

Written by
Iris Vance
Updated on: July 13, 2025
Recommendation

Fine-tune the DeepSeek-R1-Distill-Qwen-32B model on a single RTX 4090 card for efficient processing of medical data.

Core content:
1. How to fine-tune the DeepSeek-R1-Distill-Qwen-32B model on a single RTX 4090 card
2. Using unsloth and LoRA to optimize video memory usage and make the fine-tuning feasible
3. Results and complete code from a full fine-tuning run on the BeiLian cloud computing platform

Using the same method as in the previous article (fine-tuning DeepSeek-R1-Distill-Qwen-14B with unsloth and medical data on a single RTX 4090), you can also fine-tune deepseek-ai/DeepSeek-R1-Distill-Qwen-32B on a single RTX 4090 with 24 GB of video memory, even though the model's weight files total 62 GB. This is possible because unsloth's quantized LoRA fine-tuning, together with a few parameter optimizations, significantly reduces video memory usage.
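As a rough sanity check (my own back-of-the-envelope arithmetic, not something from the original article), 4-bit weights need only about half a byte per parameter, so the quantized base model alone leaves a comfortable margin on a 24 GB card:

# Rough estimate only; the ~32.8B parameter count is approximate, and quantization
# constants, gradients, optimizer states, activations and CUDA overhead come on top.
n_params = 32.8e9                        # approximate parameter count of the 32B model
base_bytes = n_params * 0.5              # ~0.5 bytes per weight with 4-bit quantization
lora_params = 268_435_456                # trainable LoRA parameters reported in the training log (r=32)
lora_bytes = lora_params * 2             # adapters kept in 16-bit, 2 bytes each
print(f"4-bit base weights   : {base_bytes / 1024**3:.1f} GiB")    # ~15.3 GiB
print(f"LoRA adapters (16-bit): {lora_bytes / 1024**3:.2f} GiB")   # ~0.50 GiB

The remaining headroom goes to gradients, 8-bit optimizer states, activations and the CUDA context, which is why the peak usage reported at the end of this article sits just under the 24 GB limit.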
The fine-tuning results in the previous article were not good because max_steps = 60 was set to limit execution to 60 steps so that the experiment would finish quickly. After removing this parameter, SFTTrainer automatically calculates the number of training steps from the amount of data. This time, all 24,772 records in the FreedomIntelligence/medical-o1-reasoning-SFT dataset were used, with the default of 3 epochs. The fine-tuning results are as follows:
  • Total steps: 9,288
  • Total training epochs: 3.0
  • Data volume per epoch: 24,772 records
  • Training time: 28 hours, 28 minutes, 37 seconds (102,517.84 seconds)
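For reference, the total of 9,288 steps follows directly from the batch settings. The sketch below is my own arithmetic using the way the Hugging Face Trainer counts optimizer steps on a single GPU, not code from the training script:

import math

num_examples = 24772
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
num_train_epochs = 3

batches_per_epoch = math.ceil(num_examples / per_device_train_batch_size)   # 12,386 micro-batches per epoch
steps_per_epoch = batches_per_epoch // gradient_accumulation_steps          # 3,096 optimizer steps per epoch
total_steps = steps_per_epoch * num_train_epochs
print(total_steps)                                                           # 9288, matching the log below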

This training was completed on the BeiLian cloud computing platform (https://cloud.lccomputing.com).
The complete training code is as follows:
import wandb

# Log in to wandb.ai for experiment tracking
wandb.login(key="Place your token on the wandb.ai website")

# Initialize the wandb project
run = wandb.init(
    project='Lora-R1-Distill-Qwen on Medical COT Dataset',
    job_type="training",
    anonymous="allow"
)
####################################################################################################
# 1. Load the model
# Load the model using unsloth's optimized FastLanguageModel
from unsloth import FastLanguageModel

max_seq_length = 4096   # Maximum sequence length
dtype = None            # Data type; None means automatic selection
load_in_4bit = True     # Load the model with 4-bit quantization to save video memory
# Load the pre-trained model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    # model_name = "unsloth/DeepSeek-R1-Distill-Qwen-7B",
    model_name = "/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
    local_files_only = True,  # avoid networking
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    # token = hf_token,
)
print(model)
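# (Optional check, not part of the original training script.) Confirm how much of the
# 24 GB card the 4-bit checkpoint occupies right after loading:
import torch
gpu = torch.cuda.get_device_properties(0)
print(f"GPU: {gpu.name}, total memory: {gpu.total_memory / 1024**3:.2f} GiB")
print(f"Allocated after load: {torch.cuda.memory_allocated(0) / 1024**3:.2f} GiB")
print(f"Reserved by PyTorch: {torch.cuda.memory_reserved(0) / 1024**3:.2f} GiB")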
####################################################################################################
# 2. Define the prompt template and run inference before fine-tuning
prompt_style = """Following are instructions describing the task, as well as input to provide more context.
Please write a response that appropriately completes this request.
Before answering, think about the question carefully and create a step-by-step chain of thought to ensure your response is logical and accurate.

### Instruction:
You are a medical professional with expertise in clinical reasoning, diagnosis, and treatment planning.
Please answer the following medical questions.

### Question:
{}

### Response:
<think>{}"""

train_prompt_style = prompt_style + """
</think>
{}"""
# Medical question for testing
question = "A 70-year-old male patient was admitted to the hospital for chest pain and vomiting for 16 hours. The electrocardiogram showed ST-segment elevation of 0.1~0.3 mV in the inferior wall leads and right chest leads. After fluid infusion, the blood pressure dropped to 80/60 mmHg. The patient had symptoms of dyspnea and inability to lie flat. Physical examination revealed a large number of bubbling sounds in both lungs. In this case, what is the most appropriate drug treatment?"
# Set the model to inference mode
FastLanguageModel.for_inference(model)

inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")
# Generate an answer
outputs = model.generate(
    input_ids = inputs.input_ids,
    attention_mask = inputs.attention_mask,
    max_new_tokens = 1200,
    use_cache = True,
)
response = tokenizer.batch_decode(outputs)
print("### Model inference results before fine-tuning: ")
print(response[0].split("### Response:")[1])
####################################################################################################
# 3. Process the dataset
EOS_TOKEN = tokenizer.eos_token  # End-of-sequence marker

# Formatting function used to process examples in the dataset
def formatting_prompts_func(examples):
    # Extract questions, chains of thought and answers from the examples
    inputs = examples["Question"]     # list of medical questions
    cots = examples["Complex_CoT"]    # list of chains of thought
    outputs = examples["Response"]    # list of answers

    # Store the formatted text
    texts = []

    # Traverse each example and combine the question, chain of thought and answer into the specified format
    for input, cot, output in zip(inputs, cots, outputs):
        # Format the text with the train_prompt_style template and append the end marker
        text = train_prompt_style.format(input, cot, output) + EOS_TOKEN
        texts.append(text)

    # Return the formatted text dictionary
    return {
        "text": texts,
    }
# Load the dataset and apply formatting
from datasets import load_dataset, load_from_disk

dataset = load_dataset(
    "json",  # specify the data format as JSON
    data_files="/datasets/FreedomIntelligence/medical-o1-reasoning-SFT/medical_o1_sft_Chinese.json",
    # split="train[0:500]",  # only take the first 500 records
    trust_remote_code=True,  # compatible with remote-code behavior
)
# If a DatasetDict is returned, take the "train" split
if isinstance(dataset, dict):
    dataset = dataset["train"]

dataset = dataset.map(formatting_prompts_func, batched=True,)
print(dataset)  # View the dataset structure
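# (Optional check, not part of the original training script.) Inspect one formatted record
# to verify that the prompt template, chain of thought and EOS token were stitched together
# as expected before committing to a 28-hour run:
print(dataset[0]["text"][:800])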
####################################################################################################
# 4. Configure training parameters and start training
model = FastLanguageModel.get_peft_model(
    model,
    r = 32,
    target_modules = [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
    random_state = 8137,
    use_rslora = False,
    loftq_config = None,
)
print(model)
# Configure training parameters and initialize the trainer
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported
# Initialize the SFT trainer
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",  # name of the text field in the dataset
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,  # number of parallel processes for dataset preprocessing, improves CPU utilization
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,  # warmup steps: gradually increase the learning rate
        learning_rate = 2e-4,  # learning rate
        lr_scheduler_type = "linear",  # linear learning-rate scheduler
        # max_steps = 200,  # maximum number of training steps (one step = one batch of data)
        fp16 = not is_bfloat16_supported(),  # use fp16 if bf16 is not supported
        bf16 = is_bfloat16_supported(),  # use bf16 if supported
        logging_steps = 10,  # log every 10 steps
        optim = "adamw_8bit",  # 8-bit AdamW optimizer saves video memory with little effect on training quality
        weight_decay = 0.01,  # weight-decay coefficient for regularization, prevents overfitting
        seed = 8137,  # random seed
        output_dir = "outputs",  # where to save model checkpoints and training logs
        run_name = "medical-o1-sft-experiment",  # explicitly set the wandb run name to avoid warnings
    ),
)
# Start training
print(f"trainer.args.max_steps: {trainer.args.max_steps}")
print(f"trainer.args.num_train_epochs: {trainer.args.num_train_epochs}")
trainer.train()
print(f"Total training steps: {trainer.state.max_steps}")
print(f"Total epochs: {trainer.state.epoch}")
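# (Not part of the original run.) A job of this length can be interrupted; assuming the
# default checkpointing behaviour of TrainingArguments (a checkpoint written to output_dir
# every 500 steps), training could be resumed instead with:
# trainer.train(resume_from_checkpoint=True)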
####################################################################################################
# 5. Run inference on the fine-tuned model
FastLanguageModel.for_inference(model)
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to("cuda")
# Generate an answer
outputs = model.generate(
    input_ids = inputs.input_ids,            # input token-id sequence
    attention_mask = inputs.attention_mask,  # attention mask marking valid input positions
    max_new_tokens = 1200,                   # maximum number of newly generated tokens
    use_cache = True,                        # use the KV cache to speed up generation
)
response = tokenizer.batch_decode(outputs)
print("### Model inference results after fine-tuning: ")
print(response[0].split("### Response:")[1])
####################################################################################################
# 6. Save the model
new_model_local = "DeepSeek-R1-Medical-COT-Qwen-32B"
model.save_pretrained(new_model_local)
tokenizer.save_pretrained(new_model_local)
# Save the merged 16-bit model
model.save_pretrained_merged(new_model_local, tokenizer, save_method = "merged_16bit",)
# Save as a GGUF model
# model.save_pretrained_gguf("DeepSeek-R1-Qwen-32B-Medical-COT-GGUF", tokenizer,)
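After training, the merged 16-bit checkpoint can be reloaded like any other Hugging Face model. The snippet below is a minimal sketch of that (not part of the article's script): it reuses the prompt_style template and test question defined earlier, and because the merged 32B model is far larger than 24 GB in 16-bit, it is re-quantized to 4 bit here so inference still fits on the same card (this assumes bitsandbytes is installed).

# Minimal reload sketch (assumption: the merged model was saved to the folder below).
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_path = "DeepSeek-R1-Medical-COT-Qwen-32B"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",                                            # place layers automatically
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),    # re-quantize to fit in 24 GB
)
inputs = tokenizer([prompt_style.format(question, "")], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1200)
print(tokenizer.batch_decode(outputs)[0].split("### Response:")[1])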

The complete log is as follows:

wandb: Appending key for api.wandb.ai to your netrc file: /root/.netrc
wandb: W&B API key is configured. Use `wandb login --relogin` to force relogin
wandb: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
wandb: Tracking run with wandb version 0.19.6
wandb: Run data is saved locally in /workspace/wandb/run-20250212_150918-mvocwedu
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run ruby-wind-2
wandb: ⭐️ View project at https://wandb.ai/xlxkming-none/Lora-R1-Distill-Qwen%20on%20Medical%20COT%20Dataset?apiKey=edb4e5ad4f056c86bc64f3ea1d5b327e88378327
wandb: View run at https://wandb.ai/xlxkming-none/Lora-R1-Distill-Qwen%20on%20Medical%20COT%20Dataset/runs/mvocwedu?apiKey=edb4e5ad4f056c86bc64f3ea1d5b327e88378327
wandb: WARNING Do NOT share these links with anyone. They can be used to claim your runs.
Unsloth: Will patch your computer to enable 2x faster free finetuning.
Unsloth Zoo will now patch everything to make training faster!
INFO 02-12 15:09:30 __init__.py:190] Automatically detected platform cuda.
==((====))==  Unsloth 2025.2.4: Fast Qwen2 patching. Transformers: 4.48.3.
   \\   /|    GPU: NVIDIA GeForce RTX 4090. Max memory: 23.65 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1+cu121. CUDA: 8.9. CUDA Toolkit: 12.1. Triton: 3.1.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.29.post1. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
Loading checkpoint shards: 100%|██████████| 8/8 [00:16<00:00, 2.07s/it]
Unsloth 2025.2.4 patched 64 layers with 64 QKV layers, 64 O layers and 64 MLP layers.
/models/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B does not have a padding token!
Will use pad_token = <|vision_pad|>.
Qwen2ForCausalLM(
  (model): Qwen2Model(
    (embed_tokens): Embedding(152064, 5120, padding_idx=151654)
    (layers): ModuleList(
      (0-63): 64 x Qwen2DecoderLayer(
        (self_attn): Qwen2Attention(
          (q_proj): Linear4bit(in_features=5120, out_features=5120, bias=True)
          (k_proj): Linear4bit(in_features=5120, out_features=1024, bias=True)
          (v_proj): Linear4bit(in_features=5120, out_features=1024, bias=True)
          (o_proj): Linear4bit(in_features=5120, out_features=5120, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): Qwen2MLP(
          (gate_proj): Linear4bit(in_features=5120, out_features=27648, bias=False)
          (up_proj): Linear4bit(in_features=5120, out_features=27648, bias=False)
          (down_proj): Linear4bit(in_features=27648, out_features=5120, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): Qwen2RMSNorm((5120,), eps=1e-05)
        (post_attention_layernorm): Qwen2RMSNorm((5120,), eps=1e-05)
      )
    )
    (norm): Qwen2RMSNorm((5120,), eps=1e-05)
    (rotary_emb): LlamaRotaryEmbedding()
  )
  (lm_head): Linear(in_features=5120, out_features=152064, bias=False)
)

### Model inference results before fine-tuning:

<think><think>Well, this problem looks a bit complicated, but I will analyze it step by step. First, I need to understand the patient's condition and test results, then determine the possible diagnosis, and finally choose the appropriate drug treatment.

The patient is a 70-year-old male who came to the doctor because of chest pain and vomiting for 16 hours. The electrocardiogram shows ST elevation in the inferior leads and right chest leads, which may indicate myocardial infarction, especially inferior and right ventricular infarction. Because inferior myocardial infarction is usually related to right coronary artery obstruction, and ST elevation in the right chest leads may involve the right ventricle.

Next, the patient's blood pressure dropped to 80/60 mmHg after fluid infusion, which may mean hypotension, but the blood pressure dropped instead after fluid infusion, which may be because the heart function is impaired and cannot pump blood effectively, leading to cardiogenic shock. At the same time, the patient has difficulty breathing and cannot lie flat. Physical examination found a lot of bubbling sounds in both lungs, which may indicate pulmonary edema, especially cardiogenic pulmonary edema, because the heart cannot pump blood effectively, causing fluid to accumulate in the lungs.

Now, I need to determine the specific situation of the patient. Inferior and right ventricular infarction may cause a decrease in the heart's pumping function, especially right ventricular dysfunction, which affects the heart's output and causes hypotension and pulmonary edema. In this case, the patient's hemodynamic state may be unstable and requires emergency treatment.

Next, consider drug treatment. Typically, for myocardial infarction, we use antiplatelet drugs (such as aspirin), anticoagulants (such as heparin or ticagrelor), and beta-blockers, ACEIs, or ARBs to improve heart function and reduce heart workload. However, the patient now has low blood pressure, and ACEIs may not be suitable for use, because ACEIs may further lower blood pressure and cause hypotension to worsen. In addition, the patient has pulmonary edema and may need diuretics to reduce fluid accumulation in the lungs. However, diuretics may cause further reduction of blood volume, thereby aggravating hypotension, which may not be suitable for the current situation.
Considering the patient's hypotension and pulmonary edema, positive inotropic drugs such as dopamine or dobutamine may be needed to enhance cardiac contractility and improve cardiac output, thereby raising blood pressure and reducing pulmonary edema. At the same time, the use of other drugs may need to be adjusted to avoid further affecting blood pressure. In addition, the patient may need mechanical ventilation support, especially if the dyspnea is severe and he cannot lie flat, and noninvasive ventilation or intubation may be required. But this may be beyond the scope of current drug treatment.

In summary, the patient's situation may involve an inferior and right ventricular myocardial infarction, resulting in cardiogenic shock and pulmonary edema. In this case, the most appropriate medical management may include the use of positive inotropes (such as dopamine or dobutamine) to improve cardiac function, while continuing antiplatelet and anticoagulant therapy, but carefully adjusting to avoid worsening hypotension. Diuretics may also be required to reduce pulmonary edema, but they need to be used under monitoring to prevent hypovolemia. Of course, specific situations may require further evaluation, such as cardiac ultrasound to determine the function of the right ventricle and the presence of mechanical complications, such as ventricular septal perforation or papillary muscle insufficiency. In addition, interventional treatment, such as coronary angiography and stenting, may be required to restore blood flow and improve cardiac function. But according to the problem, it is mainly in the medical management, so the focus should be on the use of positive inotropes and supportive care, while monitoring and adjusting the use of other drugs. </think>

For this patient's condition, the most appropriate drug treatment is as follows:
1. **Antiplatelet and anticoagulant therapy**: Continue to use aspirin and clopidogrel (or ticagrelor), and give heparin anticoagulation to prevent further thrombosis.
2. **Positive inotropic drugs**: Use dopamine or dobutamine to enhance cardiac contractility, improve cardiac output, increase blood pressure, and reduce pulmonary edema.
3. **Diuretics**: Use diuretics (such as furosemide) under monitoring to reduce pulmonary edema, but be careful to avoid hypovolemia.
4. **Avoid the use of ACEI or ARB**: Due to the patient's low blood pressure, temporarily avoid the use of ACEI or ARB to prevent further lowering of blood pressure.
5. **Monitoring and supportive treatment**: Closely monitor the patient's vital signs, perform mechanical ventilation support if necessary, and consider interventional treatment (such as coronary angiography and stent implantation) to restore blood flow.
In summary, the focus of drug treatment is to use positive inotropic drugs and supportive therapy, while continuing antiplatelet and anticoagulant therapy to improve cardiac function and hemodynamic status.
<|end of sentence|>

Dataset({
    features: ['Question', 'Complex_CoT', 'Response', 'text'],
    num_rows: 24772
})

PeftModelForCausalLM(
  (base_model): LoraModel(
    (model): Qwen2ForCausalLM(
      (model): Qwen2Model(
        (embed_tokens): Embedding(152064, 5120, padding_idx=151654)
        (layers): ModuleList(
          (0-63): 64 x Qwen2DecoderLayer(
            (self_attn): Qwen2Attention(
              (q_proj): lora.Linear4bit(
                (base_layer): Linear4bit(in_features=5120, out_features=5120, bias=True)
                (lora_dropout): ModuleDict(
                  (default): Identity()
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=5120, out_features=32, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=32, out_features=5120, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
                (lora_magnitude_vector): ModuleDict()
              )
              .
              .
              .
            )
          )
        )
      )
    )
  )
)...

Training process and result log:

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 24,772 | Num Epochs = 3
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 4
\        /    Total batch size = 8 | Total steps = 9,288
 "-____-"     Number of trainable parameters = 268,435,456
trainer.args.max_steps: -1
trainer.args.num_train_epochs: 3.0
{'loss': 2.149, 'grad_norm': 0.16384364664554596, 'learning_rate': 0.00019989227620381345, 'epoch': 0.0}
{'loss': 1.5362, 'grad_norm': 0.07211203873157501, 'learning_rate': 0.0001996768286114403, 'epoch': 0.01}
{'loss': 1.4647, 'grad_norm': 0.07446285337209702, 'learning_rate': 0.00019946138101906713, 'epoch': 0.01}
...
{'loss': 1.39, 'grad_norm': 0.08653779327869415, 'learning_rate': 0.0001977378002800819, 'epoch': 0.04}
...
{'loss': 1.2627, 'grad_norm': 0.1181635782122612, 'learning_rate': 0.00013590434126898633, 'epoch': 0.96}
...
{'loss': 1.1951, 'grad_norm': 0.11674296855926514, 'learning_rate': 0.00013224173219864268, 'epoch': 1.02}
...
{'loss': 1.071, 'grad_norm': 0.1962611824274063, 'learning_rate': 3.1843154152752344e-05, 'epoch': 2.52}
...
{'loss': 0.9945, 'grad_norm': 0.17683860659599304, 'learning_rate': 2.2794355273079824e-05, 'epoch': 2.66}
...
{'loss': 1.1104, 'grad_norm': 0.21208912134170532, 'learning_rate': 6.032532586448347e-07, 'epoch': 2.99}
{'loss': 1.0957, 'grad_norm': 0.2164667695760727, 'learning_rate': 3.8780566627167944e-07, 'epoch': 2.99}
{'loss': 1.101, 'grad_norm': 0.21290326118469238, 'learning_rate': 1.723580738985242e-07, 'epoch': 3.0}
100%|██████████| 9288/9288 [28:28:37<00:00, 11.04s/it]
{'train_runtime': 102517.8411, 'train_samples_per_second': 0.725, 'train_steps_per_second': 0.091, 'train_loss': 1.210533706973484, 'epoch': 3.0}
Total training steps: 9288
Total epochs: 2.999192636848054

### Model inference results after fine-tuning:
  <think>The patient is a 70-year-old male who came to the hospital with chest pain and vomiting, which first made me think that it might be related to the heart. The electrocardiogram showed ST segment elevation in the inferior leads and right chest leads, which might mean an inferior myocardial infarction. Next, his blood pressure dropped to 80/60 mmHg after fluid infusion, which is very low, and he also had difficulty breathing, could not lie flat, and had bubbling sounds in his lungs, which made me suspect that he had cardiogenic shock or acute heart failure.
Cardiogenic shock and acute heart failure usually require rapid treatment because they are life-threatening. The first thing to think about is to stabilize his hemodynamic status. Usually, we use positive inotropic drugs in this situation, such as dobutamine, because it can increase myocardial contractility, increase cardiac output, and help improve symptoms of hypotension and dyspnea.
But wait, the patient has hypotension and pulmonary edema, which makes me think that it may not just be a simple cardiogenic shock, but also a fluid overload problem. If it is fluid overload, using a diuretic such as furosemide may be more effective because it can help remove the excess fluid, reduce pulmonary edema, and reduce the workload on the heart.
On second thought, the patient's hypotension is severe and he cannot lie flat, which may indicate that the heart is very poorly pumping. In this case, it may be more appropriate to use a positive inotropic drug to increase the force of the heart's contractions. Dobutamine can increase cardiac output while improving hypotension, which may be a better choice at this time.
Oh, by the way, the patient's electrocardiogram shows inferior ST-segment elevation, which may indicate right ventricular infarction. Right ventricular infarction may cause cardiogenic shock and requires special attention. In this case, it may be more appropriate to use dobutamine to enhance cardiac contractility and increase cardiac output.
In summary, considering the patient's severe hypotension, dyspnea, and the possibility of right ventricular infarction, it is most appropriate to use dobutamine to quickly improve the hemodynamic status. Well, this should be a wise choice. </think>

In this case, the patient presented with hypotension, dyspnea, inability to lie flat, and bubbling sounds in the lungs, symptoms that suggest possible cardiogenic shock or acute heart failure. The electrocardiogram showed ST segment elevation in the inferior leads and right chest leads, suggesting possible inferior wall myocardial infarction, possibly involving the right ventricle.
For this situation, the most appropriate drug treatment is to use positive inotropic drugs to improve the heart's pumping function and increase cardiac output, thereby improving the symptoms of hypotension and dyspnea. Dobutamine is a commonly used positive inotropic drug that can increase myocardial contractility and cardiac output, while also dilating blood vessels to a certain extent, reducing cardiac afterload, and helping to improve the patient's hemodynamic state.
Therefore, considering the patient's current hemodynamic instability and possible right ventricular infarction, the use of dobutamine is a reasonable and necessary choice. This drug intervention can quickly help stabilize the patient's condition and buy time for subsequent treatment. <|end of sentence|>

Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 303.83 out of 503.72 RAM for saving.
Unsloth: Saving model... This might take 5 minutes...
  0%| | 0/64 [00:00<?, ?it/s]
We will save to Disk and not RAM now.
100%|██████████| 64/64 [01:34<00:00, 1.47s/it]
Unsloth: Saving tokenizer... Done.
Done.
wandb: View run ruby-wind-2 at: https://wandb.ai/xlxkming-none/Lora-R1-Distill-Qwen%20on%20Medical%20COT%20Dataset/runs/mvocwedu?apiKey=edb4e5ad4f056c86bc64f3ea1d5b327e88378327
wandb: Find logs at: wandb/run-20250212_150918-mvocwedu/logs

Peak resource usage (LoRA rank 32):

|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090       Off  | 00000000:01:00.0   Off |                  Off |
| 76%   64C    P2          392W /  450W   | 24176MiB / 24564MiB    |    100%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

I tested LoRA rank 8 earlier, and it does not use much less video memory than rank 32 (a rough parameter-count estimate follows the readout below):

|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090       Off  | 00000000:01:00.0   Off |                  Off |
| 82%   65C    P2          394W /  450W   | 21246MiB / 24564MiB    |    100%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
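A rough parameter-count estimate (my own arithmetic from the layer shapes printed in the model dump above, not something from the original article) shows why the rank matters so little for memory: the adapters are tiny next to the 4-bit base weights.

hidden, kv_out, mlp, layers = 5120, 1024, 27648, 64

def lora_param_count(r):
    # Each adapted Linear(in, out) adds r*(in + out) parameters (lora_A plus lora_B).
    per_layer = r * (
        (hidden + hidden)      # q_proj
        + (hidden + kv_out)    # k_proj
        + (hidden + kv_out)    # v_proj
        + (hidden + hidden)    # o_proj
        + (hidden + mlp)       # gate_proj
        + (hidden + mlp)       # up_proj
        + (mlp + hidden)       # down_proj
    )
    return per_layer * layers

print(lora_param_count(32))  # 268,435,456 -- matches the trainable-parameter count in the training banner
print(lora_param_count(8))   # 67,108,864

Even at rank 32 the adapter weights come to only about 0.5 GiB in 16-bit, so dropping to rank 8 saves well under half a gigabyte of weights; the remaining gap of roughly 3 GiB between the two readouts presumably comes from the gradients and optimizer states attached to those extra parameters plus allocator overhead, which is still small relative to the quantized 32B base model.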