DAC-Denoiser-LLaMA-1B-DAC-SE2_1B_1GPU_cont_v2

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.9397
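
To make the loss figure easier to interpret, it can be converted to a perplexity. This is a minimal sketch that assumes the reported loss is a mean per-token cross-entropy measured in nats, which the card itself does not state.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 3.9397

# Assumption: the loss is a mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.1f}")
```

Under that assumption the evaluation perplexity comes out at about 51.4.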

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 490
  • training_steps: 24527
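
The hyperparameters above can be written down as keyword arguments, and the total batch size and learning-rate curve can be checked by hand. The sketch below is illustrative only: the key names follow `transformers.TrainingArguments`, the single-device count is an assumption suggested by "1GPU" in the model name, and `lr_at` is a hypothetical helper that assumes a linear warmup followed by a half-cosine decay to zero.

```python
import math

# Values copied from the hyperparameter list; key names follow
# transformers.TrainingArguments keyword arguments.
training_kwargs = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 1,
    "seed": 42,
    "gradient_accumulation_steps": 32,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "warmup_steps": 490,
    "max_steps": 24527,
}

NUM_DEVICES = 1  # assumption, suggested by "1GPU" in the model name

# Total batch size = per-device batch * accumulation steps * devices.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * NUM_DEVICES
)


def lr_at(step, base_lr=1e-5, warmup=490, total=24527):
    """Learning rate at an optimizer step (assumed schedule shape).

    Linear warmup from 0 to base_lr over `warmup` steps, then a
    half-cosine decay from base_lr to 0 at `total` steps.
    """
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


print(effective_batch_size)
print(lr_at(245), lr_at(490), lr_at(24527))
```

With these values the total batch size works out to 32, matching `total_train_batch_size`, and the learning rate peaks at 1e-05 at step 490 before decaying toward zero at step 24527.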

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 8.9257 | 0.0083 | 200 | 8.9732 |
| 6.7348 | 0.0165 | 400 | 6.7776 |
| 6.6337 | 0.0248 | 600 | 6.6992 |
| 6.5831 | 0.0330 | 800 | 6.6427 |
| 6.5035 | 0.0413 | 1000 | 6.5786 |
| 6.3535 | 0.0496 | 1200 | 6.4691 |
| 6.0956 | 0.0578 | 1400 | 6.2538 |
| 5.7696 | 0.0661 | 1600 | 5.9178 |
| 5.4102 | 0.0743 | 1800 | 5.6647 |
| 5.2992 | 0.0826 | 2000 | 5.4859 |
| 4.9603 | 0.0909 | 2200 | 5.1943 |
| 4.8496 | 0.0991 | 2400 | 4.9846 |
| 4.7874 | 0.1074 | 2600 | 4.8687 |
| 4.6003 | 0.1156 | 2800 | 4.7883 |
| 4.5398 | 0.1239 | 3000 | 4.6783 |
| 4.4368 | 0.1321 | 3200 | 4.6156 |
| 4.5215 | 0.1404 | 3400 | 4.5750 |
| 4.5379 | 0.1487 | 3600 | 4.5496 |
| 4.4139 | 0.1569 | 3800 | 4.5098 |
| 4.3908 | 0.1652 | 4000 | 4.4832 |
| 4.3592 | 0.1734 | 4200 | 4.4558 |
| 4.2609 | 0.1817 | 4400 | 4.4330 |
| 4.3614 | 0.1900 | 4600 | 4.4146 |
| 4.2868 | 0.1982 | 4800 | 4.3926 |
| 4.2852 | 0.2065 | 5000 | 4.3767 |
| 4.2367 | 0.2147 | 5200 | 4.3632 |
| 4.2743 | 0.2230 | 5400 | 4.3406 |
| 4.3228 | 0.2313 | 5600 | 4.3304 |
| 4.0882 | 0.2395 | 5800 | 4.3177 |
| 4.2148 | 0.2478 | 6000 | 4.3025 |
| 4.1705 | 0.2560 | 6200 | 4.2915 |
| 4.1606 | 0.2643 | 6400 | 4.2806 |
| 4.1353 | 0.2726 | 6600 | 4.2699 |
| 4.0473 | 0.2808 | 6800 | 4.2593 |
| 4.1826 | 0.2891 | 7000 | 4.2465 |
| 4.0894 | 0.2973 | 7200 | 4.2371 |
| 4.1944 | 0.3056 | 7400 | 4.2275 |
| 4.0872 | 0.3139 | 7600 | 4.2171 |
| 4.0388 | 0.3221 | 7800 | 4.2084 |
| 4.0974 | 0.3304 | 8000 | 4.1970 |
| 4.0866 | 0.3386 | 8200 | 4.1915 |
| 4.0224 | 0.3469 | 8400 | 4.1837 |
| 4.0804 | 0.3552 | 8600 | 4.1749 |
| 4.1001 | 0.3634 | 8800 | 4.1703 |
| 3.9772 | 0.3717 | 9000 | 4.1605 |
| 4.0091 | 0.3799 | 9200 | 4.1510 |
| 3.9754 | 0.3882 | 9400 | 4.1478 |
| 4.0554 | 0.3964 | 9600 | 4.1413 |
| 3.973 | 0.4047 | 9800 | 4.1336 |
| 4.024 | 0.4130 | 10000 | 4.1262 |
| 3.9008 | 0.4212 | 10200 | 4.1203 |
| 3.9837 | 0.4295 | 10400 | 4.1165 |
| 3.9856 | 0.4377 | 10600 | 4.1082 |
| 4.05 | 0.4460 | 10800 | 4.1063 |
| 3.9868 | 0.4543 | 11000 | 4.0997 |
| 3.9677 | 0.4625 | 11200 | 4.0953 |
| 3.9456 | 0.4708 | 11400 | 4.0857 |
| 3.9515 | 0.4790 | 11600 | 4.0807 |
| 3.8984 | 0.4873 | 11800 | 4.0785 |
| 3.9625 | 0.4956 | 12000 | 4.0744 |
| 3.9627 | 0.5038 | 12200 | 4.0683 |
| 3.8908 | 0.5121 | 12400 | 4.0630 |
| 3.8739 | 0.5203 | 12600 | 4.0602 |
| 3.9079 | 0.5286 | 12800 | 4.0548 |
| 3.9565 | 0.5369 | 13000 | 4.0505 |
| 3.9753 | 0.5451 | 13200 | 4.0475 |
| 3.9373 | 0.5534 | 13400 | 4.0415 |
| 3.8967 | 0.5616 | 13600 | 4.0386 |
| 3.8869 | 0.5699 | 13800 | 4.0353 |
| 3.9242 | 0.5782 | 14000 | 4.0333 |
| 3.8214 | 0.5864 | 14200 | 4.0287 |
| 3.8305 | 0.5947 | 14400 | 4.0252 |
| 3.8868 | 0.5953 | 14600 | 4.0214 |
| 3.8755 | 0.6034 | 14800 | 4.0177 |
| 3.8983 | 0.6116 | 15000 | 4.0148 |
| 3.8179 | 0.6197 | 15200 | 4.0113 |
| 3.9368 | 0.6279 | 15400 | 4.0076 |
| 3.8364 | 0.6360 | 15600 | 4.0057 |
| 3.9392 | 0.6442 | 15800 | 4.0010 |
| 3.8992 | 0.6523 | 16000 | 3.9977 |
| 3.8775 | 0.6605 | 16200 | 3.9959 |
| 3.9484 | 0.6686 | 16400 | 3.9936 |
| 3.827 | 0.6768 | 16600 | 3.9890 |
| 3.9737 | 0.6850 | 16800 | 3.9865 |
| 3.8611 | 0.6931 | 17000 | 3.9835 |
| 3.8686 | 0.7013 | 17200 | 3.9811 |
| 3.8596 | 0.7094 | 17400 | 3.9789 |
| 3.7207 | 0.7176 | 17600 | 3.9752 |
| 3.8712 | 0.7257 | 17800 | 3.9737 |
| 3.8364 | 0.7339 | 18000 | 3.9716 |
| 3.9145 | 0.7420 | 18200 | 3.9699 |
| 3.8044 | 0.7502 | 18400 | 3.9677 |
| 3.7504 | 0.7583 | 18600 | 3.9654 |
| 3.8542 | 0.7665 | 18800 | 3.9634 |
| 3.8688 | 0.7747 | 19000 | 3.9614 |
| 3.8832 | 0.7828 | 19200 | 3.9593 |
| 3.834 | 0.7910 | 19400 | 3.9579 |
| 3.8855 | 0.7991 | 19600 | 3.9561 |
| 3.8744 | 0.8073 | 19800 | 3.9557 |
| 3.7769 | 0.8154 | 20000 | 3.9535 |
| 3.8508 | 0.8236 | 20200 | 3.9525 |
| 3.7791 | 0.8317 | 20400 | 3.9510 |
| 3.8041 | 0.8399 | 20600 | 3.9489 |
| 3.7265 | 0.8480 | 20800 | 3.9479 |
| 3.8421 | 0.8562 | 21000 | 3.9470 |
| 3.7523 | 0.8643 | 21200 | 3.9462 |
| 3.8736 | 0.8725 | 21400 | 3.9458 |
| 3.8183 | 0.8807 | 21600 | 3.9449 |
| 3.7868 | 0.8888 | 21800 | 3.9438 |
| 3.7659 | 0.8970 | 22000 | 3.9431 |
| 3.791 | 0.9051 | 22200 | 3.9427 |
| 3.7429 | 0.9133 | 22400 | 3.9416 |
| 3.7534 | 0.9214 | 22600 | 3.9414 |
| 3.7807 | 0.9296 | 22800 | 3.9409 |
| 3.771 | 0.9377 | 23000 | 3.9408 |
| 3.8103 | 0.9459 | 23200 | 3.9404 |
| 3.8326 | 0.9540 | 23400 | 3.9402 |
| 3.8351 | 0.9622 | 23600 | 3.9400 |
| 3.8171 | 0.9704 | 23800 | 3.9399 |
| 3.8372 | 0.9785 | 24000 | 3.9398 |
| 3.8325 | 0.9867 | 24200 | 3.9397 |
| 3.8412 | 0.9948 | 24400 | 3.9397 |
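
For anyone who wants to work with the results table programmatically, the rows can be parsed into records. The following is a rough sketch: `LogRow` and `parse_row` are hypothetical helpers, and only the first and last rows of the table are used as input.

```python
from typing import NamedTuple


class LogRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float


def parse_row(line: str) -> LogRow:
    """Parse one results row; tolerates '|' column separators."""
    fields = line.replace("|", " ").split()
    train_loss, epoch, step, val_loss = fields
    return LogRow(float(train_loss), float(epoch), int(step), float(val_loss))


# First and last rows copied from the results table.
first = parse_row("8.9257 0.0083 200 8.9732")
last = parse_row("3.8412 0.9948 24400 3.9397")

# Total validation-loss reduction over the logged run.
val_drop = first.val_loss - last.val_loss
# Final validation loss minus the training loss logged at the same step.
gap = last.val_loss - last.train_loss

print(f"validation loss fell by {val_drop:.4f}; final gap {gap:.4f}")
```

On these two rows, validation loss falls by 5.0335 in total and ends 0.0985 above the training loss logged at the same step. The training-loss column is a value logged at that step rather than a full pass over the training set, so the gap is only a rough indicator.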

Framework versions

  • Transformers 4.56.1
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.0
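
To compare a local environment against the versions listed above, something along these lines may help. It is a sketch only: `version_mismatches` is a hypothetical helper, and the mapping from the names in this card to PyPI distribution names (for example "Pytorch" to `torch`) is an assumption.

```python
from importlib import metadata

# Versions listed in this card, keyed by PyPI distribution name
# (the name mapping, e.g. "Pytorch" -> "torch", is an assumption).
EXPECTED = {
    "transformers": "4.56.1",
    "torch": "2.8.0+cu128",
    "datasets": "4.0.0",
    "tokenizers": "0.22.0",
}


def version_mismatches(expected, get_version=metadata.version):
    """Return {package: (expected, found)} for every package that differs.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in version_mismatches(EXPECTED).items():
        print(f"{name}: card lists {wanted}, found {found or 'not installed'}")
```

The check is an exact string comparison, so a different CUDA build of PyTorch (a different `+cu...` suffix) is reported as a mismatch even when the base version matches.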

Safetensors

  • Model size: 0.9B params
  • Tensor type: F32

Evaluation results