metadata
library_name: transformers
license: other
base_model: >-
  /nfs/stak/users/gautammi/my-hpc-share/workspace/research/research/RNADesign_Mine/models/Qwen2.5-0.5B-RNA
tags:
  - llama-factory
  - full
  - generated_from_trainer
model-index:
  - name: sft
    results: []

sft

This model is a fine-tuned version of /nfs/stak/users/gautammi/my-hpc-share/workspace/research/research/RNADesign_Mine/models/Qwen2.5-0.5B-RNA on the V4_phase2_train dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4317
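For a more intuitive reading of this number, the loss can be converted to perplexity. This is a minimal sketch that assumes the reported value is the mean per-token cross-entropy in nats (the usual convention for causal language-model training, but not stated in this card); the variable names are illustrative.

```python
import math

eval_loss = 0.4317  # final evaluation loss reported above

# Assumption: eval_loss is mean per-token cross-entropy measured in nats,
# so perplexity is simply its exponential.
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")  # about 1.54
```

Under that assumption, a perplexity near 1.54 means the model is, on average, choosing among roughly one and a half equally likely tokens at each supervised position.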

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 256
  • eval_batch_size: 256
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 512
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1.0
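The batch-size and schedule entries above can be cross-checked with a little arithmetic. The sketch below assumes a single training device (the card lists no device count), estimates the total number of optimizer steps from the results table (step 4500 at epoch 0.9803), and applies the standard linear-warmup-then-cosine-decay formula. `NUM_DEVICES`, `TOTAL_STEPS`, and the helper `lr_at` are illustrative assumptions, not values logged by the trainer.

```python
import math

# Values copied from the hyperparameter list.
LEARNING_RATE = 1e-5
TRAIN_BATCH_SIZE = 256
GRAD_ACCUM_STEPS = 2
WARMUP_RATIO = 0.1

# Assumptions (not stated in the card): one device, and roughly 4590
# optimizer steps in the single epoch, estimated as 4500 / 0.9803.
NUM_DEVICES = 1
TOTAL_STEPS = round(4500 / 0.9803)

effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Linear warmup to LEARNING_RATE, then cosine decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / warmup_steps
    progress = (step - warmup_steps) / (TOTAL_STEPS - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch)  # 512, matching total_train_batch_size
for step in (100, warmup_steps, 2300, TOTAL_STEPS):
    print(step, f"{lr_at(step):.2e}")
```

If these assumptions hold, the learning rate peaks somewhere around step 460 and decays to zero by the end of the epoch.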

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.1666        | 0.0218 | 100  | 1.1604          |
| 1.1794        | 0.0436 | 200  | 1.1023          |
| 1.1542        | 0.0654 | 300  | 1.0963          |
| 1.1265        | 0.0871 | 400  | 1.0635          |
| 0.8963        | 0.1089 | 500  | 0.8135          |
| 0.7587        | 0.1307 | 600  | 0.6705          |
| 0.6947        | 0.1525 | 700  | 0.6066          |
| 0.5813        | 0.1743 | 800  | 0.5263          |
| 0.5319        | 0.1961 | 900  | 0.4963          |
| 0.5030        | 0.2178 | 1000 | 0.4993          |
| 0.4843        | 0.2396 | 1100 | 0.4703          |
| 0.4691        | 0.2614 | 1200 | 0.4688          |
| 0.4602        | 0.2832 | 1300 | 0.4593          |
| 0.4495        | 0.3050 | 1400 | 0.4542          |
| 0.4435        | 0.3268 | 1500 | 0.4499          |
| 0.4351        | 0.3485 | 1600 | 0.4446          |
| 0.4335        | 0.3703 | 1700 | 0.4409          |
| 0.4259        | 0.3921 | 1800 | 0.4384          |
| 0.4254        | 0.4139 | 1900 | 0.4347          |
| 0.4193        | 0.4357 | 2000 | 0.4358          |
| 0.4164        | 0.4575 | 2100 | 0.4329          |
| 0.4142        | 0.4793 | 2200 | 0.4327          |
| 0.4119        | 0.5010 | 2300 | 0.4287          |
| 0.4109        | 0.5228 | 2400 | 0.4288          |
| 0.4117        | 0.5446 | 2500 | 0.4306          |
| 0.4073        | 0.5664 | 2600 | 0.4350          |
| 0.4062        | 0.5882 | 2700 | 0.4280          |
| 0.4037        | 0.6100 | 2800 | 0.4277          |
| 0.4054        | 0.6317 | 2900 | 0.4272          |
| 0.4031        | 0.6535 | 3000 | 0.4284          |
| 0.3998        | 0.6753 | 3100 | 0.4282          |
| 0.4003        | 0.6971 | 3200 | 0.4296          |
| 0.4021        | 0.7189 | 3300 | 0.4282          |
| 0.3982        | 0.7407 | 3400 | 0.4296          |
| 0.3988        | 0.7624 | 3500 | 0.4298          |
| 0.3988        | 0.7842 | 3600 | 0.4299          |
| 0.3949        | 0.8060 | 3700 | 0.4309          |
| 0.3961        | 0.8278 | 3800 | 0.4298          |
| 0.3952        | 0.8496 | 3900 | 0.4307          |
| 0.3970        | 0.8714 | 4000 | 0.4310          |
| 0.3935        | 0.8931 | 4100 | 0.4307          |
| 0.3931        | 0.9149 | 4200 | 0.4322          |
| 0.3942        | 0.9367 | 4300 | 0.4313          |
| 0.3951        | 0.9585 | 4400 | 0.4317          |
| 0.3922        | 0.9803 | 4500 | 0.4317          |
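
The reported evaluation loss (0.4317) is the value at the last logged step, not the lowest value in the table. The following sketch locates the minimum validation loss from the logged values; it only reads the numbers above and makes no claim about which checkpoints were actually saved. The variable names are illustrative.

```python
# Validation loss at steps 100, 200, ..., 4500, copied from the table above.
val_losses = [
    1.1604, 1.1023, 1.0963, 1.0635, 0.8135,
    0.6705, 0.6066, 0.5263, 0.4963, 0.4993,
    0.4703, 0.4688, 0.4593, 0.4542, 0.4499,
    0.4446, 0.4409, 0.4384, 0.4347, 0.4358,
    0.4329, 0.4327, 0.4287, 0.4288, 0.4306,
    0.4350, 0.4280, 0.4277, 0.4272, 0.4284,
    0.4282, 0.4296, 0.4282, 0.4296, 0.4298,
    0.4299, 0.4309, 0.4298, 0.4307, 0.4310,
    0.4307, 0.4322, 0.4313, 0.4317, 0.4317,
]
steps = [100 * (i + 1) for i in range(len(val_losses))]

# min() on (loss, step) pairs picks the lowest loss, earliest step on ties.
best_loss, best_step = min(zip(val_losses, steps))
final_loss, final_step = val_losses[-1], steps[-1]
gap = final_loss - best_loss

print(f"best:  step {best_step}, loss {best_loss:.4f}")
print(f"final: step {final_step}, loss {final_loss:.4f}")
print(f"gap:   {gap:.4f}")
```

The minimum falls at step 2900 (0.4272). After that point the validation loss drifts up slightly while the training loss keeps falling, which may indicate mild overfitting late in the epoch.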

Framework versions

  • Transformers 4.56.1
  • Pytorch 2.4.1+cu121
  • Datasets 4.0.0
  • Tokenizers 0.22.0
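
To compare a local environment against the versions listed above, something like the following sketch could be used. It relies only on the standard-library `importlib.metadata`; the `EXPECTED` mapping and the `check_versions` helper are illustrative names, and the PyPI distribution name `torch` is assumed for the Pytorch entry.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in the card, keyed by PyPI distribution name.
EXPECTED = {
    "transformers": "4.56.1",
    "torch": "2.4.1",  # the card lists the CUDA 12.1 build, 2.4.1+cu121
    "datasets": "4.0.0",
    "tokenizers": "0.22.0",
}


def check_versions(expected: dict[str, str]) -> dict[str, str]:
    """Return a status per package: 'ok', 'missing', or 'found <version>'."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        # Ignore local build tags such as "+cu121" when comparing.
        base = installed.split("+", 1)[0]
        report[name] = "ok" if base == wanted else f"found {installed}"
    return report


for name, status in check_versions(EXPECTED).items():
    print(f"{name}: {status}")
```

A mismatch does not necessarily mean the model will fail to load; it only flags a difference from the environment recorded at training time.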