cpr-modernBERT-C

This model is a fine-tuned version of answerdotai/ModernBERT-base. The training dataset is not specified. It achieves the following results on the evaluation set:

  • Loss: 0.8875
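
Because the card does not say which task head the checkpoint carries, a cautious first step is to inspect its configuration before choosing a model class. The following is a minimal sketch, assuming the repository id is kdutia/cpr-modernBERT-C and that the repository includes tokenizer files. The helper name `inspect_checkpoint` is hypothetical and used only for illustration.

```python
MODEL_ID = "kdutia/cpr-modernBERT-C"  # assumed repository id for this model


def inspect_checkpoint(model_id: str = MODEL_ID):
    """Download the config, tokenizer, and base encoder for `model_id`.

    Hypothetical helper for illustration; requires network access.
    """
    # Imported lazily so that defining this function does not require transformers.
    from transformers import AutoConfig, AutoModel, AutoTokenizer

    config = AutoConfig.from_pretrained(model_id)
    # `architectures` records the class the checkpoint was saved from,
    # which hints at the task head; the card itself does not state it.
    print("Saved architecture(s):", config.architectures)

    # Assumption: the tokenizer was saved alongside the model weights.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # AutoModel loads only the base encoder, without any task-specific head.
    model = AutoModel.from_pretrained(model_id)
    return config, tokenizer, model
```

Calling `inspect_checkpoint()` prints the saved architecture names, which can then guide the choice of a task-specific `AutoModelFor...` class.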

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 0.1
  • num_epochs: 3
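
The hyperparameters above imply an effective batch size and a learning-rate curve that can be sketched in a few lines. This is an illustrative reconstruction, not the training code: it assumes a single device (so that 16 × 4 gives the reported total of 64), treats the reported warmup value of 0.1 as a fraction of total steps, and takes the total of 31401 steps from the final row of the training results. The function names are hypothetical.

```python
import math

LEARNING_RATE = 5e-05
PER_DEVICE_BATCH = 16
GRAD_ACCUM_STEPS = 4
NUM_DEVICES = 1        # assumption: 16 * 4 * 1 matches the reported total of 64
TOTAL_STEPS = 31401    # final step reported in the training results
WARMUP_FRACTION = 0.1  # assumption: the reported 0.1 is a fraction of total steps


def effective_batch_size() -> int:
    """Examples consumed per optimizer step."""
    return PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES


def cosine_lr(step: int,
              total_steps: int = TOTAL_STEPS,
              warmup_fraction: float = WARMUP_FRACTION,
              peak: float = LEARNING_RATE) -> float:
    """Linear warmup to `peak`, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak * step / warmup_steps
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for s in (0, 1000, 3141, 15000, 31401):
        print(s, cosine_lr(s))
```

Under these assumptions the learning rate peaks at 5e-05 once warmup ends and decays to approximately zero by the final step.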

Training results

Training Loss   Epoch    Step    Validation Loss
1.0847          0.0478   500     1.0791
1.0631          0.0955   1000    1.0546
1.0360          0.1433   1500    1.0357
1.0323          0.1911   2000    1.0269
1.0190          0.2389   2500    1.0166
1.0142          0.2866   3000    1.0045
0.9938          0.3344   3500    0.9997
0.9956          0.3822   4000    0.9899
0.9850          0.4299   4500    0.9859
0.9697          0.4777   5000    0.9767
0.9751          0.5255   5500    0.9746
0.9626          0.5733   6000    0.9682
0.9609          0.6210   6500    0.9637
0.9569          0.6688   7000    0.9594
0.9582          0.7166   7500    0.9534
0.9545          0.7643   8000    0.9501
0.9457          0.8121   8500    0.9486
0.9437          0.8599   9000    0.9431
0.9444          0.9077   9500    0.9435
0.9429          0.9554   10000   0.9369
0.9386          1.0032   10500   0.9370
0.9348          1.0509   11000   0.9306
0.9282          1.0987   11500   0.9275
0.9263          1.1465   12000   0.9266
0.9235          1.1942   12500   0.9250
0.9192          1.2420   13000   0.9229
0.9208          1.2898   13500   0.9188
0.9186          1.3376   14000   0.9190
0.9195          1.3853   14500   0.9158
0.9095          1.4331   15000   0.9156
0.9135          1.4809   15500   0.9105
0.9095          1.5286   16000   0.9097
0.9045          1.5764   16500   0.9102
0.9130          1.6242   17000   0.9090
0.9057          1.6720   17500   0.9057
0.8996          1.7197   18000   0.9055
0.9005          1.7675   18500   0.9052
0.8959          1.8153   19000   0.9007
0.9017          1.8630   19500   0.8989
0.8990          1.9108   20000   0.9000
0.8935          1.9586   20500   0.8947
0.9007          2.0063   21000   0.8931
0.8921          2.0541   21500   0.8922
0.8845          2.1018   22000   0.8933
0.8859          2.1496   22500   0.8931
0.8802          2.1974   23000   0.8922
0.8847          2.2452   23500   0.8933
0.8841          2.2929   24000   0.8895
0.8844          2.3407   24500   0.8878
0.8920          2.3885   25000   0.8901
0.8806          2.4362   25500   0.8876
0.8761          2.4840   26000   0.8862
0.8860          2.5318   26500   0.8873
0.8819          2.5796   27000   0.8883
0.8732          2.6273   27500   0.8865
0.8787          2.6751   28000   0.8857
0.8831          2.7229   28500   0.8851
0.8773          2.7706   29000   0.8881
0.8761          2.8184   29500   0.8868
0.8747          2.8662   30000   0.8864
0.8809          2.9140   30500   0.8845
0.8857          2.9617   31000   0.8853
0.8795          3.0      31401   0.8875
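
The step and epoch columns allow a rough reconstruction of the training set size, and the validation column shows that the final checkpoint is not the best one logged: the lowest validation loss in the table is 0.8845 at step 30500, against 0.8875 at the final step. The sketch below works through that arithmetic. It assumes every optimizer step consumed a full batch of 64, so the example count is an approximation, and it embeds only the last six rows of the table. The helper names are hypothetical.

```python
FINAL_STEP = 31401
NUM_EPOCHS = 3
TOTAL_BATCH = 64

# 31401 steps over 3 epochs gives 10467 optimizer steps per epoch.
steps_per_epoch = FINAL_STEP // NUM_EPOCHS
# Assumption: every step used a full batch, so this is an approximation.
approx_examples_per_epoch = steps_per_epoch * TOTAL_BATCH


def epoch_at(step: int) -> float:
    """Fractional epoch for a step, rounded as in the results table."""
    return round(step / steps_per_epoch, 4)


# Last six rows of the results table:
# training loss, epoch, step, validation loss.
ROWS = """\
0.8773 2.7706 29000 0.8881
0.8761 2.8184 29500 0.8868
0.8747 2.8662 30000 0.8864
0.8809 2.9140 30500 0.8845
0.8857 2.9617 31000 0.8853
0.8795 3.0 31401 0.8875
"""


def parse_rows(text: str):
    """Parse whitespace-separated rows into (train, epoch, step, val) tuples."""
    rows = []
    for line in text.strip().splitlines():
        train, epoch, step, val = line.split()
        rows.append((float(train), float(epoch), int(step), float(val)))
    return rows


rows = parse_rows(ROWS)
# Row with the lowest validation loss among the rows embedded above.
best = min(rows, key=lambda r: r[3])

if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch)
    print("approx. examples per epoch:", approx_examples_per_epoch)
    print("best step / validation loss:", best[2], best[3])
```

If checkpoints were saved at each evaluation step, the one from step 30500 would be a candidate to compare against the final weights; the card does not say which checkpoints were kept.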

Framework versions

  • Transformers 5.10.2
  • Pytorch 2.11.0+cu128
  • Datasets 5.0.0
  • Tokenizers 0.22.2

Model size

  • Parameters: 0.1B
  • Tensor type: F32
  • Format: Safetensors

Model tree for kdutia/cpr-modernBERT-C

  • Fine-tuned from answerdotai/ModernBERT-base