cpr-bert

This model is a fine-tuned version of answerdotai/ModernBERT-base; the training dataset is not specified. It achieves the following results on the evaluation set:

  • Loss: 1.1389
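
For a rough sense of scale, the evaluation loss can be converted to a perplexity. The sketch below assumes, since the card does not say, that the reported loss is a mean cross-entropy in nats (for example from a language-modeling objective). If the loss is something else, the conversion does not apply. The names `EVAL_LOSS` and `perplexity` are illustrative.

```python
import math

EVAL_LOSS = 1.1389  # final evaluation loss reported in this card


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Return exp(loss).

    Only meaningful if the loss is a mean cross-entropy measured in nats,
    which is an assumption here, not something the card states.
    """
    return math.exp(mean_cross_entropy_nats)


eval_perplexity = perplexity(EVAL_LOSS)
print(f"approximate perplexity: {eval_perplexity:.2f}")
```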

Model description

More information needed

Intended uses & limitations

More information needed
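
Until intended uses are documented, the following is only a minimal loading sketch. It assumes the checkpoint is published under the repository id `kdutia/cpr-ModernBERT` (as shown on the model page) and that the installed `transformers` version supports ModernBERT. Because the card does not state which task head was trained, the sketch loads the bare encoder with `AutoModel`; the helper names `load_encoder` and `encode` are hypothetical.

```python
MODEL_ID = "kdutia/cpr-ModernBERT"  # repository id as shown on the model page


def load_encoder(model_id: str = MODEL_ID):
    """Load the tokenizer and the bare encoder (no task head) from the Hub."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # disable dropout for inference
    return tokenizer, model


def encode(texts, tokenizer, model):
    """Return the last hidden states for a list of strings."""
    import torch

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        return model(**batch).last_hidden_state


if __name__ == "__main__":
    tok, enc = load_encoder()
    print(encode(["An example sentence."], tok, enc).shape)
```

If the checkpoint carries a task-specific head, the matching `AutoModelFor...` class would be the better choice once the task is known.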

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_steps: 0.1
  • num_epochs: 3
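
The relationship between these values can be sketched in a few lines. The example below is illustrative only and rests on three assumptions the card does not confirm: training ran on a single device, the warmup value 0.1 is a fraction of the total optimizer steps (31464, the final step in the results table), and the warmup length is rounded up. The schedule itself follows the usual definition of `constant_with_warmup`: a linear ramp from zero to the base rate, then a constant rate.

```python
import math

LEARNING_RATE = 3e-4
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 4
NUM_DEVICES = 1        # assumption: device count is not stated in the card
TOTAL_STEPS = 31464    # final optimizer step in the results table
WARMUP_FRACTION = 0.1  # assumption: 0.1 is a fraction of total steps

# Sequences contributing to each optimizer update.
effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES

# Rounding up is an assumption about how the fraction becomes a step count.
warmup_steps = math.ceil(WARMUP_FRACTION * TOTAL_STEPS)


def lr_at(step: int, base_lr: float = LEARNING_RATE, warmup: int = warmup_steps) -> float:
    """Linear warmup from 0 to base_lr, then constant."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr


for s in (0, 1000, warmup_steps, TOTAL_STEPS):
    print(s, lr_at(s))
```

With these assumptions the effective batch size matches the reported total_train_batch_size of 64.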

Training results

Training Loss  Epoch   Step   Validation Loss
1.3608         0.0477  500    1.3563
1.3280         0.0953  1000   1.3294
1.3066         0.1430  1500   1.3178
1.3036         0.1907  2000   1.3114
1.3019         0.2384  2500   1.3034
1.3055         0.2860  3000   1.3068
1.2977         0.3337  3500   1.3027
1.2974         0.3814  4000   1.2931
1.2809         0.4291  4500   1.2788
1.2733         0.4767  5000   1.2729
1.2574         0.5244  5500   1.2665
1.2522         0.5721  6000   1.2587
1.2495         0.6198  6500   1.2552
1.2402         0.6674  7000   1.2473
1.2463         0.7151  7500   1.2413
1.2397         0.7628  8000   1.2377
1.2284         0.8105  8500   1.2339
1.2226         0.8581  9000   1.2308
1.2229         0.9058  9500   1.2246
1.2206         0.9535  10000  1.2226
1.2131         1.0011  10500  1.2182
1.2139         1.0488  11000  1.2163
1.1971         1.0965  11500  1.2108
1.2019         1.1442  12000  1.2082
1.1946         1.1918  12500  1.2055
1.1983         1.2395  13000  1.2014
1.1950         1.2872  13500  1.1999
1.1955         1.3349  14000  1.1956
1.1881         1.3825  14500  1.1956
1.1824         1.4302  15000  1.1924
1.1836         1.4779  15500  1.1889
1.1676         1.5256  16000  1.1884
1.1792         1.5732  16500  1.1823
1.1764         1.6209  17000  1.1839
1.1758         1.6686  17500  1.1817
1.1631         1.7163  18000  1.1769
1.1700         1.7639  18500  1.1755
1.1651         1.8116  19000  1.1755
1.1634         1.8593  19500  1.1722
1.1646         1.9070  20000  1.1745
1.1633         1.9546  20500  1.1686
1.1674         2.0023  21000  1.1659
1.1492         2.0500  21500  1.1633
1.1486         2.0976  22000  1.1629
1.1389         2.1453  22500  1.1624
1.1518         2.1930  23000  1.1609
1.1524         2.2407  23500  1.1601
1.1498         2.2883  24000  1.1568
1.1467         2.3360  24500  1.1575
1.1424         2.3837  25000  1.1533
1.1392         2.4314  25500  1.1500
1.1416         2.4790  26000  1.1485
1.1340         2.5267  26500  1.1502
1.1372         2.5744  27000  1.1481
1.1436         2.6221  27500  1.1471
1.1345         2.6697  28000  1.1469
1.1292         2.7174  28500  1.1444
1.1258         2.7651  29000  1.1423
1.1264         2.8128  29500  1.1429
1.1264         2.8604  30000  1.1414
1.1366         2.9081  30500  1.1376
1.1265         2.9558  31000  1.1355
1.1315         3.0     31464  1.1389
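
The Epoch and Step columns are tied together by the number of optimizer steps per epoch, which can be recovered from the final row. The sketch below derives that figure and an approximate per-epoch sequence count; the count assumes every optimizer step consumed a full batch of 64 sequences, so it is an upper estimate rather than the actual dataset size. It also shows that the headline evaluation loss (1.1389) is the value at the final step, while the lowest validation loss in the table occurs one evaluation earlier.

```python
FINAL_STEP = 31464            # last row of the results table
NUM_EPOCHS = 3
TOTAL_TRAIN_BATCH_SIZE = 64   # from the hyperparameter list

steps_per_epoch = FINAL_STEP // NUM_EPOCHS


def epoch_at(step: int) -> float:
    """Fractional epoch for a given optimizer step, rounded as in the table."""
    return round(step / steps_per_epoch, 4)


# Upper estimate: assumes every step used a full batch (not confirmed by the card).
approx_sequences_per_epoch = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

# (step, validation loss) for the last three evaluations in the table.
last_evals = [(30500, 1.1376), (31000, 1.1355), (31464, 1.1389)]
best_step, best_loss = min(last_evals, key=lambda row: row[1])

print(steps_per_epoch, approx_sequences_per_epoch, epoch_at(500), best_step, best_loss)
```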

Framework versions

  • Transformers 5.10.2
  • Pytorch 2.11.0+cu128
  • Datasets 5.0.0
  • Tokenizers 0.22.2

Model size: 0.1B params (Safetensors, tensor type F32)
Model tree for kdutia/cpr-ModernBERT: fine-tuned from answerdotai/ModernBERT-base