counterfactual_seed-42_1e-3

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.1863
  • Accuracy: 0.4007
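For a language model whose evaluation loss is a mean per-token cross-entropy in nats (the Transformers `Trainer` default), the loss above converts directly to perplexity. A minimal sketch, assuming that convention holds for this run:

```python
import math

# Assumes the reported eval loss (3.1863) is mean per-token
# cross-entropy in nats, as the Transformers Trainer reports it.
eval_loss = 3.1863
perplexity = math.exp(eval_loss)
print(f"Perplexity: {perplexity:.2f}")  # ~24.20
```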

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 32000
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
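The hyperparameters above interact in two ways worth making explicit: the effective batch size is the per-device batch size times the gradient-accumulation steps, and the linear scheduler ramps the learning rate over the warmup window. A plain-Python sketch of both (the warmup function mirrors the ramp-up phase of Transformers' linear scheduler, written out for illustration; note that 32,000 warmup steps exceeds the ~29,720 total steps shown in the results table, so the learning rate would still have been ramping when training ended):

```python
# Effective batch size = per-device batch * gradient accumulation.
train_batch_size = 32
gradient_accumulation_steps = 8
effective_batch = train_batch_size * gradient_accumulation_steps
print(effective_batch)  # 256 -- matches total_train_batch_size

def warmup_lr(step, base_lr=1e-3, warmup_steps=32_000):
    """Learning rate during linear warmup (capped at base_lr)."""
    return base_lr * min(1.0, step / warmup_steps)

print(warmup_lr(16_000))  # 0.0005 -- halfway through warmup
print(warmup_lr(29_720))  # ~0.000929 -- LR at the final training step
```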

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Accuracy |
|:-------------:|:-------:|:-----:|:---------------:|:--------:|
| 6.0073        | 0.9994  | 1486  | 4.4192          | 0.2920   |
| 4.3098        | 1.9992  | 2972  | 3.9141          | 0.3313   |
| 3.7057        | 2.9991  | 4458  | 3.6329          | 0.3552   |
| 3.5345        | 3.9996  | 5945  | 3.4651          | 0.3710   |
| 3.3114        | 4.9994  | 7431  | 3.3682          | 0.3802   |
| 3.2377        | 5.9992  | 8917  | 3.3098          | 0.3859   |
| 3.1306        | 6.9991  | 10403 | 3.2696          | 0.3896   |
| 3.0915        | 7.9996  | 11890 | 3.2479          | 0.3926   |
| 3.0307        | 8.9994  | 13376 | 3.2247          | 0.3945   |
| 3.0054        | 9.9992  | 14862 | 3.2130          | 0.3962   |
| 2.9679        | 10.9991 | 16348 | 3.2049          | 0.3972   |
| 2.9475        | 11.9996 | 17835 | 3.1986          | 0.3978   |
| 2.9256        | 12.9994 | 19321 | 3.1964          | 0.3988   |
| 2.9069        | 13.9992 | 20807 | 3.1900          | 0.3994   |
| 2.8976        | 14.9991 | 22293 | 3.1946          | 0.3994   |
| 2.8790        | 15.9996 | 23780 | 3.1883          | 0.3999   |
| 2.8775        | 16.9994 | 25266 | 3.1918          | 0.4004   |
| 2.8601        | 17.9992 | 26752 | 3.1878          | 0.4006   |
| 2.8664        | 18.9991 | 28238 | 3.1885          | 0.4006   |
| 2.8445        | 19.9962 | 29720 | 3.1863          | 0.4007   |

Framework versions

  • Transformers 4.46.2
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.20.0
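To reproduce this environment, the pinned versions above can be installed with pip. This is a sketch under assumptions: package names are inferred from the libraries listed, and the `+cu124` suffix indicates a CUDA 12.4 PyTorch build, which typically comes from PyTorch's own wheel index rather than the default PyPI build.

```shell
pip install transformers==4.46.2 datasets==3.2.0 tokenizers==0.20.0
# The cu124 wheel for PyTorch 2.5.1 may require PyTorch's index URL;
# plain `pip install torch==2.5.1` installs the default build instead.
pip install torch==2.5.1 --index-url https://download.pytorch.org/whl/cu124
```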
Model details

  • Model size: 0.1B params
  • Tensor type: F32 (Safetensors)