	---
	library_name: transformers
	base_model: IRIIS-RESEARCH/RoBERTa_Nepali_125M
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	- precision
	- recall
	- f1
	model-index:
	- name: nepali-gec-binary-detector
	  results: []
	---


	# nepali-gec-binary-detector

	This model is a fine-tuned version of [IRIIS-RESEARCH/RoBERTa_Nepali_125M](https://huggingface.co/IRIIS-RESEARCH/RoBERTa_Nepali_125M) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.0407
	- Accuracy: 0.9874
	- Precision: 0.9338
	- Recall: 0.8248
	- F1: 0.8759
	- Sentence Accuracy: 0.8828
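
The reported F1 can be cross-checked against the reported precision and recall. The sketch below assumes F1 is the usual harmonic mean of precision and recall for the positive class; the card does not state how the metrics were computed, and the helper name `f1_score` is purely illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Final evaluation values copied from the list above.
precision, recall = 0.9338, 0.8248
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.4f}")  # prints "F1 = 0.8759", matching the reported value
```

Under that assumption the three numbers are mutually consistent, and the gap between precision and recall shows that the model misses errors more often than it raises false alarms.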

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-06
	- train_batch_size: 1024
	- eval_batch_size: 1024
	- seed: 42
	- optimizer: AdamW, fused PyTorch implementation (`OptimizerNames.ADAMW_TORCH_FUSED`), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 500
	- num_epochs: 3
	- mixed_precision_training: Native AMP
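
To make the `linear` scheduler with 500 warmup steps concrete, here is a minimal plain-Python sketch of a warmup-then-linear-decay schedule. The function name `linear_lr` is hypothetical, and the total of 38,112 optimizer steps is an assumption estimated from the logged step/epoch ratio (about 12,704 steps per epoch over 3 epochs), not a value stated in the card.

```python
def linear_lr(step: int,
              base_lr: float = 2e-6,
              warmup_steps: int = 500,
              total_steps: int = 38_112) -> float:
    """Learning rate at a given optimizer step.

    Rises linearly from 0 to base_lr over warmup_steps, then decays
    linearly to 0 at total_steps. total_steps is an assumed estimate.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the schedule at a few steps, including the warmup boundary.
for step in (0, 250, 500, 19_000, 38_000):
    print(f"step {step:>6}: lr = {linear_lr(step):.3e}")
```

With a peak of only 2e-06, the learning rate is already small when warmup ends, which is in line with the slow, steady improvement seen over the three epochs.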

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 | Sentence Accuracy |
	|:-------------:|:------:|:-----:|:---------------:|:--------:|:---------:|:------:|:------:|:-----------------:|
	| 0.1917 | 0.0787 | 1000 | 0.1144 | 0.9668 | 0.8991 | 0.4344 | 0.5858 | 0.7347 |
	| 0.1083 | 0.1574 | 2000 | 0.0958 | 0.9708 | 0.8986 | 0.5197 | 0.6586 | 0.7625 |
	| 0.0927 | 0.2361 | 3000 | 0.0791 | 0.9745 | 0.8839 | 0.6084 | 0.7207 | 0.7772 |
	| 0.0807 | 0.3149 | 4000 | 0.0686 | 0.9780 | 0.8904 | 0.6760 | 0.7685 | 0.8015 |
	| 0.0726 | 0.3936 | 5000 | 0.0623 | 0.9800 | 0.8910 | 0.7187 | 0.7956 | 0.8179 |
	| 0.0671 | 0.4723 | 6000 | 0.0588 | 0.9813 | 0.9001 | 0.7369 | 0.8104 | 0.8286 |
	| 0.0637 | 0.5510 | 7000 | 0.0559 | 0.9823 | 0.9017 | 0.7551 | 0.8219 | 0.8366 |
	| 0.0608 | 0.6297 | 8000 | 0.0542 | 0.9830 | 0.9128 | 0.7587 | 0.8287 | 0.8434 |
	| 0.0588 | 0.7084 | 9000 | 0.0520 | 0.9836 | 0.9135 | 0.7705 | 0.8359 | 0.8489 |
	| 0.0569 | 0.7872 | 10000 | 0.0507 | 0.9841 | 0.9191 | 0.7748 | 0.8408 | 0.8532 |
	| 0.0552 | 0.8659 | 11000 | 0.0493 | 0.9845 | 0.9221 | 0.7793 | 0.8447 | 0.8568 |
	| 0.0543 | 0.9446 | 12000 | 0.0483 | 0.9849 | 0.9226 | 0.7866 | 0.8492 | 0.8597 |
	| 0.053 | 1.0233 | 13000 | 0.0479 | 0.9851 | 0.9242 | 0.7901 | 0.8519 | 0.8623 |
	| 0.0517 | 1.1020 | 14000 | 0.0467 | 0.9854 | 0.9207 | 0.7990 | 0.8556 | 0.8644 |
	| 0.0511 | 1.1807 | 15000 | 0.0461 | 0.9856 | 0.9251 | 0.7992 | 0.8575 | 0.8671 |
	| 0.0504 | 1.2594 | 16000 | 0.0452 | 0.9858 | 0.9205 | 0.8082 | 0.8607 | 0.8683 |
	| 0.0497 | 1.3382 | 17000 | 0.0448 | 0.9860 | 0.9244 | 0.8073 | 0.8619 | 0.8700 |
	| 0.0492 | 1.4169 | 18000 | 0.0443 | 0.9862 | 0.9289 | 0.8058 | 0.8630 | 0.8718 |
	| 0.0485 | 1.4956 | 19000 | 0.0438 | 0.9863 | 0.9284 | 0.8098 | 0.8651 | 0.8731 |
	| 0.0483 | 1.5743 | 20000 | 0.0436 | 0.9864 | 0.9292 | 0.8111 | 0.8662 | 0.8743 |
	| 0.0477 | 1.6530 | 21000 | 0.0428 | 0.9866 | 0.9272 | 0.8156 | 0.8678 | 0.8751 |
	| 0.0476 | 1.7317 | 22000 | 0.0431 | 0.9866 | 0.9333 | 0.8109 | 0.8678 | 0.8764 |
	| 0.047 | 1.8105 | 23000 | 0.0426 | 0.9867 | 0.9311 | 0.8154 | 0.8694 | 0.8771 |
	| 0.0466 | 1.8892 | 24000 | 0.0424 | 0.9868 | 0.9338 | 0.8144 | 0.8700 | 0.8782 |
	| 0.0463 | 1.9679 | 25000 | 0.0421 | 0.9869 | 0.9326 | 0.8172 | 0.8711 | 0.8788 |
	| 0.0459 | 2.0466 | 26000 | 0.0420 | 0.9870 | 0.9333 | 0.8178 | 0.8718 | 0.8794 |
	| 0.0459 | 2.1253 | 27000 | 0.0416 | 0.9871 | 0.9308 | 0.8218 | 0.8729 | 0.8800 |
	| 0.0455 | 2.2040 | 28000 | 0.0414 | 0.9871 | 0.9314 | 0.8223 | 0.8735 | 0.8803 |
	| 0.0453 | 2.2827 | 29000 | 0.0414 | 0.9871 | 0.9327 | 0.8217 | 0.8737 | 0.8809 |
	| 0.0452 | 2.3615 | 30000 | 0.0412 | 0.9872 | 0.9330 | 0.8223 | 0.8742 | 0.8813 |
	| 0.045 | 2.4402 | 31000 | 0.0411 | 0.9872 | 0.9315 | 0.8245 | 0.8747 | 0.8815 |
	| 0.045 | 2.5189 | 32000 | 0.0410 | 0.9873 | 0.9330 | 0.8237 | 0.8749 | 0.8819 |
	| 0.0447 | 2.5976 | 33000 | 0.0409 | 0.9873 | 0.9328 | 0.8248 | 0.8755 | 0.8823 |
	| 0.0448 | 2.6763 | 34000 | 0.0409 | 0.9873 | 0.9344 | 0.8234 | 0.8754 | 0.8825 |
	| 0.0447 | 2.7550 | 35000 | 0.0407 | 0.9873 | 0.9330 | 0.8252 | 0.8758 | 0.8825 |
	| 0.0445 | 2.8338 | 36000 | 0.0408 | 0.9873 | 0.9345 | 0.8237 | 0.8756 | 0.8827 |
	| 0.0444 | 2.9125 | 37000 | 0.0407 | 0.9874 | 0.9336 | 0.8249 | 0.8759 | 0.8828 |
	| 0.0446 | 2.9912 | 38000 | 0.0407 | 0.9874 | 0.9338 | 0.8248 | 0.8759 | 0.8828 |
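
The Epoch and Step columns together imply the length of one epoch, which the card does not state directly. The following sketch derives it from the last logged row. The examples-per-epoch figure additionally assumes a single device and no gradient accumulation, so that each optimizer step consumes exactly `train_batch_size` examples; that assumption is not confirmed by the card.

```python
# (epoch, step) pair copied from the last row of the table.
last_epoch, last_step = 2.9912, 38_000
num_epochs = 3
train_batch_size = 1024  # from the hyperparameter list

# Optimizer steps needed to complete one pass over the training data.
steps_per_epoch = last_step / last_epoch

# Estimated length of the full run in optimizer steps.
estimated_total_steps = round(num_epochs * steps_per_epoch)

# Assumption: one device, no gradient accumulation.
approx_examples_per_epoch = steps_per_epoch * train_batch_size

print(f"steps per epoch    ~ {steps_per_epoch:,.0f}")
print(f"total steps        ~ {estimated_total_steps:,}")
print(f"examples per epoch ~ {approx_examples_per_epoch:,.0f}")
```

Because the logged epoch values are rounded to four decimals, these figures are estimates rather than exact counts.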


	### Framework versions

	- Transformers 4.57.1
	- PyTorch 2.8.0+cu128
	- Datasets 4.4.1
	- Tokenizers 0.22.1