Bespoke_17k_lora_checkpoint-225 / README.md

tongliuphysics

metadata

library_name: peft
license: other
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
  - llama-factory
  - lora
  - generated_from_trainer
model-index:
  - name: Bespoke_17k_lora
    results: []

Bespoke_17k_lora

This model is a LoRA adapter (PEFT) fine-tuned from Qwen/Qwen2.5-7B-Instruct on the Bespoke_17k dataset. It achieves the following results on the evaluation set:

Loss: 0.5167
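
For readers who prefer perplexity, the loss can be converted as sketched below. This assumes the reported value is the mean per-token cross-entropy in nats, which is typical for a causal language modeling loss but is not stated in this card; the function name is illustrative only.

```python
import math


def perplexity_from_loss(mean_ce_loss: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_ce_loss)


eval_loss = 0.5167  # evaluation loss reported above
perplexity = perplexity_from_loss(eval_loss)
print(f"perplexity ~ {perplexity:.3f}")
```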

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 1
eval_batch_size: 1
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 16
total_train_batch_size: 64
total_eval_batch_size: 4
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 3
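
The derived values in the list above follow from the per-device settings, and the schedule settings describe a linear warmup followed by cosine decay. The sketch below reproduces the batch-size arithmetic and approximates the learning-rate curve in plain Python. It is an illustration, not the trainer's code: the function names are made up here, the warmup length is assumed to be the total step count times the warmup ratio, rounded up, and the total step count is not given in this card, so it is left as a parameter.

```python
import math


def total_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Samples contributing to one optimizer step (or one eval pass across devices)."""
    return per_device * num_devices * grad_accum_steps


def lr_at_step(step: int, total_steps: int, peak_lr: float = 5e-5,
               warmup_ratio: float = 0.1) -> float:
    """Linear warmup to peak_lr, then half-cosine decay to zero (assumed shape)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Values taken from the hyperparameter list above.
train_total = total_batch_size(per_device=1, num_devices=4, grad_accum_steps=16)
eval_total = total_batch_size(per_device=1, num_devices=4)  # no accumulation at eval
print(train_total, eval_total)
```

Gradient accumulation only applies to training, which is why the total evaluation batch size is simply the per-device size times the number of devices.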

Training results

Training Loss	Epoch	Step	Validation Loss
0.8425	0.1290	32	0.7648
0.7261	0.2580	64	0.6592
0.6559	0.3870	96	0.5983
0.6316	0.5160	128	0.5707
0.6236	0.6450	160	0.5557
0.6061	0.7740	192	0.5463
0.5930	0.9030	224	0.5396
0.5771	1.0282	256	0.5375
0.5953	1.1572	288	0.5316
0.5735	1.2862	320	0.5289
0.5752	1.4152	352	0.5264
0.5903	1.5442	384	0.5242
0.5662	1.6732	416	0.5225
0.5656	1.8022	448	0.5209
0.5740	1.9312	480	0.5199
0.5692	2.0564	512	0.5193
0.5656	2.1854	544	0.5183
0.5654	2.3144	576	0.5177
0.5664	2.4434	608	0.5173
0.5714	2.5724	640	0.5170
0.5656	2.7014	672	0.5168
0.5681	2.8304	704	0.5168
0.5541	2.9594	736	0.5167
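
The validation column can be summarized with a few lines of Python. This is a small sketch using the (step, validation loss) pairs copied from the table above; the variable and function names are illustrative. It checks that the validation loss never increases between evaluations and measures how much of the improvement arrives in the final epoch.

```python
# (step, validation_loss) pairs copied from the table above
VAL_LOSS = [
    (32, 0.7648), (64, 0.6592), (96, 0.5983), (128, 0.5707),
    (160, 0.5557), (192, 0.5463), (224, 0.5396), (256, 0.5375),
    (288, 0.5316), (320, 0.5289), (352, 0.5264), (384, 0.5242),
    (416, 0.5225), (448, 0.5209), (480, 0.5199), (512, 0.5193),
    (544, 0.5183), (576, 0.5177), (608, 0.5173), (640, 0.5170),
    (672, 0.5168), (704, 0.5168), (736, 0.5167),
]


def is_non_increasing(values):
    """True if each value is less than or equal to the one before it."""
    return all(b <= a for a, b in zip(values, values[1:]))


losses = [loss for _, loss in VAL_LOSS]
by_step = dict(VAL_LOSS)

total_drop = losses[0] - losses[-1]
# Step 480 is the last evaluation before epoch 2; step 736 is the final one.
final_epoch_gain = by_step[480] - by_step[736]

print(f"non-increasing: {is_non_increasing(losses)}")
print(f"total drop: {total_drop:.4f}, final-epoch gain: {final_epoch_gain:.4f}")
```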

Framework versions

PEFT 0.15.2
Transformers 4.52.4
PyTorch 2.8.0+cu128
Datasets 3.6.0
Tokenizers 0.21.4
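
With the libraries listed above installed, attaching the adapter to the base model might look like the sketch below. It is an assumption about usage, not something this card documents: `load_adapter` is a hypothetical helper, and the adapter location (a Hub repo id or a local directory) is not spelled out here, so it is left as a required argument.

```python
BASE_MODEL_ID = "Qwen/Qwen2.5-7B-Instruct"  # base_model from the card metadata


def load_adapter(adapter_id: str):
    """Load the base model and attach the LoRA adapter weights.

    adapter_id: Hub repo id or local directory holding the adapter files
    (hypothetical argument; the card does not state the location).
    """
    # Heavy dependencies are imported lazily so this module can be
    # imported and inspected without them installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID, torch_dtype="auto")
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling `load_adapter(...)` downloads the 7B base weights on first use, so it needs enough disk space and memory for the base model in addition to the adapter.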