---
library_name: transformers
base_model: IRIIS-RESEARCH/RoBERTa_Nepali_125M
tags:
- generated_from_trainer
metrics:
- accuracy
- precision
- recall
- f1
model-index:
- name: nepali-gec-binary-detector
  results: []
---


# nepali-gec-binary-detector

This model is a fine-tuned version of [IRIIS-RESEARCH/RoBERTa_Nepali_125M](https://huggingface.co/IRIIS-RESEARCH/RoBERTa_Nepali_125M) on an unspecified dataset.
At the final logged evaluation (step 38000), it achieves the following results on the evaluation set:
- Loss: 0.0407
- Accuracy: 0.9874
- Precision: 0.9338
- Recall: 0.8248
- F1: 0.8759
- Sentence Accuracy: 0.8828
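
As a quick consistency check on the reported metrics, the sketch below recomputes F1 from the precision and recall listed above. It assumes F1 here is the usual harmonic mean of precision and recall for a single positive class; the card does not state the averaging mode, so this is an assumption rather than a description of the actual evaluation code.

```python
# Reported evaluation metrics, copied from the list above.
precision = 0.9338
recall = 0.8248
reported_f1 = 0.8759

# F1 as the harmonic mean of precision and recall (assumed definition).
f1 = 2 * precision * recall / (precision + recall)

print(f"recomputed F1: {f1:.4f}")  # prints "recomputed F1: 0.8759"

# Inputs are rounded to four decimals, so allow a small tolerance.
assert abs(f1 - reported_f1) < 5e-4
```

The recomputed value agrees with the reported F1 to four decimal places, which is consistent with the assumed definition.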

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-06
- train_batch_size: 1024
- eval_batch_size: 1024
- seed: 42
- optimizer: AdamW (`OptimizerNames.ADAMW_TORCH_FUSED`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 3
- mixed_precision_training: Native AMP
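
The hyperparameters above can be sketched as keyword arguments for `transformers.TrainingArguments`, together with the learning-rate curve that a linear schedule with warmup produces. Several details are assumptions, not facts from this card: that the batch sizes are per-device values, that "Native AMP" corresponds to `fp16=True`, and that training ran for about 38,112 optimizer steps (estimated as 3 epochs at roughly 12,704 steps per epoch, based on the step and epoch columns of the training results table). The `linear_lr` helper is a hypothetical stand-in written in plain Python, not the library's own scheduler.

```python
# Hyperparameters copied from the list above, keyed by the TrainingArguments
# parameter names they are assumed to correspond to.
training_kwargs = {
    "learning_rate": 2e-6,
    "per_device_train_batch_size": 1024,  # assumption: card value is per device
    "per_device_eval_batch_size": 1024,   # assumption: card value is per device
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 500,
    "num_train_epochs": 3,
    "fp16": True,  # assumption: "Native AMP" taken to mean fp16 autocast
}
# With transformers installed, these could be passed as:
#   from transformers import TrainingArguments
#   args = TrainingArguments(output_dir="out", **training_kwargs)
# ("out" is a placeholder directory name.)


def linear_lr(step, base_lr=2e-6, warmup_steps=500, total_steps=38_112):
    """Learning rate at a given optimizer step: linear warmup, then linear decay.

    total_steps is an estimate (see the lead-in), not a value from the card.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


for step in (0, 250, 500, 19_000, 38_000):
    print(f"step {step:>6}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the learning rate peaks at 2e-06 at step 500 and decays toward zero by the end of the third epoch.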

### Training results

| Training Loss | Epoch  | Step  | Validation Loss | Accuracy | Precision | Recall | F1     | Sentence Accuracy |
|:-------------:|:------:|:-----:|:---------------:|:--------:|:---------:|:------:|:------:|:-----------------:|
| 0.1917        | 0.0787 | 1000  | 0.1144          | 0.9668   | 0.8991    | 0.4344 | 0.5858 | 0.7347            |
| 0.1083        | 0.1574 | 2000  | 0.0958          | 0.9708   | 0.8986    | 0.5197 | 0.6586 | 0.7625            |
| 0.0927        | 0.2361 | 3000  | 0.0791          | 0.9745   | 0.8839    | 0.6084 | 0.7207 | 0.7772            |
| 0.0807        | 0.3149 | 4000  | 0.0686          | 0.9780   | 0.8904    | 0.6760 | 0.7685 | 0.8015            |
| 0.0726        | 0.3936 | 5000  | 0.0623          | 0.9800   | 0.8910    | 0.7187 | 0.7956 | 0.8179            |
| 0.0671        | 0.4723 | 6000  | 0.0588          | 0.9813   | 0.9001    | 0.7369 | 0.8104 | 0.8286            |
| 0.0637        | 0.5510 | 7000  | 0.0559          | 0.9823   | 0.9017    | 0.7551 | 0.8219 | 0.8366            |
| 0.0608        | 0.6297 | 8000  | 0.0542          | 0.9830   | 0.9128    | 0.7587 | 0.8287 | 0.8434            |
| 0.0588        | 0.7084 | 9000  | 0.0520          | 0.9836   | 0.9135    | 0.7705 | 0.8359 | 0.8489            |
| 0.0569        | 0.7872 | 10000 | 0.0507          | 0.9841   | 0.9191    | 0.7748 | 0.8408 | 0.8532            |
| 0.0552        | 0.8659 | 11000 | 0.0493          | 0.9845   | 0.9221    | 0.7793 | 0.8447 | 0.8568            |
| 0.0543        | 0.9446 | 12000 | 0.0483          | 0.9849   | 0.9226    | 0.7866 | 0.8492 | 0.8597            |
| 0.053         | 1.0233 | 13000 | 0.0479          | 0.9851   | 0.9242    | 0.7901 | 0.8519 | 0.8623            |
| 0.0517        | 1.1020 | 14000 | 0.0467          | 0.9854   | 0.9207    | 0.7990 | 0.8556 | 0.8644            |
| 0.0511        | 1.1807 | 15000 | 0.0461          | 0.9856   | 0.9251    | 0.7992 | 0.8575 | 0.8671            |
| 0.0504        | 1.2594 | 16000 | 0.0452          | 0.9858   | 0.9205    | 0.8082 | 0.8607 | 0.8683            |
| 0.0497        | 1.3382 | 17000 | 0.0448          | 0.9860   | 0.9244    | 0.8073 | 0.8619 | 0.8700            |
| 0.0492        | 1.4169 | 18000 | 0.0443          | 0.9862   | 0.9289    | 0.8058 | 0.8630 | 0.8718            |
| 0.0485        | 1.4956 | 19000 | 0.0438          | 0.9863   | 0.9284    | 0.8098 | 0.8651 | 0.8731            |
| 0.0483        | 1.5743 | 20000 | 0.0436          | 0.9864   | 0.9292    | 0.8111 | 0.8662 | 0.8743            |
| 0.0477        | 1.6530 | 21000 | 0.0428          | 0.9866   | 0.9272    | 0.8156 | 0.8678 | 0.8751            |
| 0.0476        | 1.7317 | 22000 | 0.0431          | 0.9866   | 0.9333    | 0.8109 | 0.8678 | 0.8764            |
| 0.047         | 1.8105 | 23000 | 0.0426          | 0.9867   | 0.9311    | 0.8154 | 0.8694 | 0.8771            |
| 0.0466        | 1.8892 | 24000 | 0.0424          | 0.9868   | 0.9338    | 0.8144 | 0.8700 | 0.8782            |
| 0.0463        | 1.9679 | 25000 | 0.0421          | 0.9869   | 0.9326    | 0.8172 | 0.8711 | 0.8788            |
| 0.0459        | 2.0466 | 26000 | 0.0420          | 0.9870   | 0.9333    | 0.8178 | 0.8718 | 0.8794            |
| 0.0459        | 2.1253 | 27000 | 0.0416          | 0.9871   | 0.9308    | 0.8218 | 0.8729 | 0.8800            |
| 0.0455        | 2.2040 | 28000 | 0.0414          | 0.9871   | 0.9314    | 0.8223 | 0.8735 | 0.8803            |
| 0.0453        | 2.2827 | 29000 | 0.0414          | 0.9871   | 0.9327    | 0.8217 | 0.8737 | 0.8809            |
| 0.0452        | 2.3615 | 30000 | 0.0412          | 0.9872   | 0.9330    | 0.8223 | 0.8742 | 0.8813            |
| 0.045         | 2.4402 | 31000 | 0.0411          | 0.9872   | 0.9315    | 0.8245 | 0.8747 | 0.8815            |
| 0.045         | 2.5189 | 32000 | 0.0410          | 0.9873   | 0.9330    | 0.8237 | 0.8749 | 0.8819            |
| 0.0447        | 2.5976 | 33000 | 0.0409          | 0.9873   | 0.9328    | 0.8248 | 0.8755 | 0.8823            |
| 0.0448        | 2.6763 | 34000 | 0.0409          | 0.9873   | 0.9344    | 0.8234 | 0.8754 | 0.8825            |
| 0.0447        | 2.7550 | 35000 | 0.0407          | 0.9873   | 0.9330    | 0.8252 | 0.8758 | 0.8825            |
| 0.0445        | 2.8338 | 36000 | 0.0408          | 0.9873   | 0.9345    | 0.8237 | 0.8756 | 0.8827            |
| 0.0444        | 2.9125 | 37000 | 0.0407          | 0.9874   | 0.9336    | 0.8249 | 0.8759 | 0.8828            |
| 0.0446        | 2.9912 | 38000 | 0.0407          | 0.9874   | 0.9338    | 0.8248 | 0.8759 | 0.8828            |
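
The Step and Epoch columns above allow a rough estimate of how many optimizer steps make up one epoch and, from that, the approximate size of the training set. The sketch below assumes a single device and no gradient accumulation, so that each step consumes exactly `train_batch_size` examples. Neither assumption is confirmed by this card, and the result is only an order-of-magnitude estimate.

```python
# (step, epoch) pair copied from the last row of the table above; the last row
# is used because the four-decimal rounding of the epoch matters least there.
last_step, last_epoch = 38_000, 2.9912
train_batch_size = 1024  # from the hyperparameter list

steps_per_epoch = last_step / last_epoch

# Assumption: one device, no gradient accumulation, so one step = one batch.
approx_train_examples = steps_per_epoch * train_batch_size

print(f"steps per epoch: about {steps_per_epoch:.0f}")
print(f"training examples: about {approx_train_examples / 1e6:.1f}M")
```

Under these assumptions, one epoch is roughly 12,704 steps, which would correspond to about 13 million training examples.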

### Framework versions

- Transformers 4.57.1
- PyTorch 2.8.0+cu128
- Datasets 4.4.1
- Tokenizers 0.22.1