DipeshChaudhary committed
Commit bdb89e9 · verified · 1 Parent(s): d28abab

Model save
README.md ADDED
@@ -0,0 +1,106 @@
---
library_name: transformers
base_model: IRIIS-RESEARCH/RoBERTa_Nepali_125M
tags:
- generated_from_trainer
metrics:
- accuracy
- precision
- recall
- f1
model-index:
- name: nepali-gec-binary-detector
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# nepali-gec-binary-detector

This model is a fine-tuned version of [IRIIS-RESEARCH/RoBERTa_Nepali_125M](https://huggingface.co/IRIIS-RESEARCH/RoBERTa_Nepali_125M) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0407
- Accuracy: 0.9874
- Precision: 0.9338
- Recall: 0.8248
- F1: 0.8759
- Sentence Accuracy: 0.8828
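As a quick sanity check (not part of the training pipeline), the reported F1 is the harmonic mean of the precision and recall above:

```python
# Verify the reported F1 from the reported precision and recall.
precision, recall = 0.9338, 0.8248
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # → 0.8759
```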
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-06
- train_batch_size: 1024
- eval_batch_size: 1024
- seed: 42
- optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 3
- mixed_precision_training: Native AMP
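The hyperparameters above could be expressed as a `transformers.TrainingArguments` configuration. This is a hedged reconstruction: the `output_dir`, the exact `Trainer` wiring, and anything not listed above are assumptions.

```python
from transformers import TrainingArguments

# Sketch of the listed hyperparameters; values not shown in the card
# (e.g. output_dir, logging/eval cadence) are illustrative assumptions.
args = TrainingArguments(
    output_dir="nepali-gec-binary-detector",  # assumed
    learning_rate=2e-6,
    per_device_train_batch_size=1024,
    per_device_eval_batch_size=1024,
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=3,
    fp16=True,  # "Native AMP" mixed precision
)
```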
### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 | Sentence Accuracy |
|:-------------:|:------:|:-----:|:---------------:|:--------:|:---------:|:------:|:------:|:-----------------:|
| 0.1917 | 0.0787 | 1000 | 0.1144 | 0.9668 | 0.8991 | 0.4344 | 0.5858 | 0.7347 |
| 0.1083 | 0.1574 | 2000 | 0.0958 | 0.9708 | 0.8986 | 0.5197 | 0.6586 | 0.7625 |
| 0.0927 | 0.2361 | 3000 | 0.0791 | 0.9745 | 0.8839 | 0.6084 | 0.7207 | 0.7772 |
| 0.0807 | 0.3149 | 4000 | 0.0686 | 0.9780 | 0.8904 | 0.6760 | 0.7685 | 0.8015 |
| 0.0726 | 0.3936 | 5000 | 0.0623 | 0.9800 | 0.8910 | 0.7187 | 0.7956 | 0.8179 |
| 0.0671 | 0.4723 | 6000 | 0.0588 | 0.9813 | 0.9001 | 0.7369 | 0.8104 | 0.8286 |
| 0.0637 | 0.5510 | 7000 | 0.0559 | 0.9823 | 0.9017 | 0.7551 | 0.8219 | 0.8366 |
| 0.0608 | 0.6297 | 8000 | 0.0542 | 0.9830 | 0.9128 | 0.7587 | 0.8287 | 0.8434 |
| 0.0588 | 0.7084 | 9000 | 0.0520 | 0.9836 | 0.9135 | 0.7705 | 0.8359 | 0.8489 |
| 0.0569 | 0.7872 | 10000 | 0.0507 | 0.9841 | 0.9191 | 0.7748 | 0.8408 | 0.8532 |
| 0.0552 | 0.8659 | 11000 | 0.0493 | 0.9845 | 0.9221 | 0.7793 | 0.8447 | 0.8568 |
| 0.0543 | 0.9446 | 12000 | 0.0483 | 0.9849 | 0.9226 | 0.7866 | 0.8492 | 0.8597 |
| 0.053 | 1.0233 | 13000 | 0.0479 | 0.9851 | 0.9242 | 0.7901 | 0.8519 | 0.8623 |
| 0.0517 | 1.1020 | 14000 | 0.0467 | 0.9854 | 0.9207 | 0.7990 | 0.8556 | 0.8644 |
| 0.0511 | 1.1807 | 15000 | 0.0461 | 0.9856 | 0.9251 | 0.7992 | 0.8575 | 0.8671 |
| 0.0504 | 1.2594 | 16000 | 0.0452 | 0.9858 | 0.9205 | 0.8082 | 0.8607 | 0.8683 |
| 0.0497 | 1.3382 | 17000 | 0.0448 | 0.9860 | 0.9244 | 0.8073 | 0.8619 | 0.8700 |
| 0.0492 | 1.4169 | 18000 | 0.0443 | 0.9862 | 0.9289 | 0.8058 | 0.8630 | 0.8718 |
| 0.0485 | 1.4956 | 19000 | 0.0438 | 0.9863 | 0.9284 | 0.8098 | 0.8651 | 0.8731 |
| 0.0483 | 1.5743 | 20000 | 0.0436 | 0.9864 | 0.9292 | 0.8111 | 0.8662 | 0.8743 |
| 0.0477 | 1.6530 | 21000 | 0.0428 | 0.9866 | 0.9272 | 0.8156 | 0.8678 | 0.8751 |
| 0.0476 | 1.7317 | 22000 | 0.0431 | 0.9866 | 0.9333 | 0.8109 | 0.8678 | 0.8764 |
| 0.047 | 1.8105 | 23000 | 0.0426 | 0.9867 | 0.9311 | 0.8154 | 0.8694 | 0.8771 |
| 0.0466 | 1.8892 | 24000 | 0.0424 | 0.9868 | 0.9338 | 0.8144 | 0.8700 | 0.8782 |
| 0.0463 | 1.9679 | 25000 | 0.0421 | 0.9869 | 0.9326 | 0.8172 | 0.8711 | 0.8788 |
| 0.0459 | 2.0466 | 26000 | 0.0420 | 0.9870 | 0.9333 | 0.8178 | 0.8718 | 0.8794 |
| 0.0459 | 2.1253 | 27000 | 0.0416 | 0.9871 | 0.9308 | 0.8218 | 0.8729 | 0.8800 |
| 0.0455 | 2.2040 | 28000 | 0.0414 | 0.9871 | 0.9314 | 0.8223 | 0.8735 | 0.8803 |
| 0.0453 | 2.2827 | 29000 | 0.0414 | 0.9871 | 0.9327 | 0.8217 | 0.8737 | 0.8809 |
| 0.0452 | 2.3615 | 30000 | 0.0412 | 0.9872 | 0.9330 | 0.8223 | 0.8742 | 0.8813 |
| 0.045 | 2.4402 | 31000 | 0.0411 | 0.9872 | 0.9315 | 0.8245 | 0.8747 | 0.8815 |
| 0.045 | 2.5189 | 32000 | 0.0410 | 0.9873 | 0.9330 | 0.8237 | 0.8749 | 0.8819 |
| 0.0447 | 2.5976 | 33000 | 0.0409 | 0.9873 | 0.9328 | 0.8248 | 0.8755 | 0.8823 |
| 0.0448 | 2.6763 | 34000 | 0.0409 | 0.9873 | 0.9344 | 0.8234 | 0.8754 | 0.8825 |
| 0.0447 | 2.7550 | 35000 | 0.0407 | 0.9873 | 0.9330 | 0.8252 | 0.8758 | 0.8825 |
| 0.0445 | 2.8338 | 36000 | 0.0408 | 0.9873 | 0.9345 | 0.8237 | 0.8756 | 0.8827 |
| 0.0444 | 2.9125 | 37000 | 0.0407 | 0.9874 | 0.9336 | 0.8249 | 0.8759 | 0.8828 |
| 0.0446 | 2.9912 | 38000 | 0.0407 | 0.9874 | 0.9338 | 0.8248 | 0.8759 | 0.8828 |


### Framework versions

- Transformers 4.57.1
- Pytorch 2.8.0+cu128
- Datasets 4.4.1
- Tokenizers 0.22.1
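Since the card's "Intended uses" section is still empty, here is a minimal inference sketch. It assumes the detector is a binary token-classification head (per-token error / no-error), which is consistent with the token-level precision/recall metrics above but not stated explicitly; the repo id, the example sentence, and the label convention (1 = erroneous) are illustrative assumptions.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Assumed repo id, mirroring the model-index name in this card.
model_id = "DipeshChaudhary/nepali-gec-binary-detector"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

sentence = "म भोलि स्कुल गएको थिए ।"  # example Nepali sentence
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)

# Assumed label convention: 1 = token flagged as erroneous.
preds = logits.argmax(dim=-1).squeeze(0).tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, label in zip(tokens, preds):
    print(token, label)
```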
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5a57d0a6c1d786c59b306dde561397c2aee375c40c122f359d9e2054209ad25b
+oid sha256:f85332b3fc2d59b6fa905f8a34d4efba0da0bc486827e3e3832d8c87ff59df95
 size 496222584
runs/Nov09_15-34-23_computeinstance-u00m8a65ysbpw7f9hb/events.out.tfevents.1762702471.computeinstance-u00m8a65ysbpw7f9hb.213884.0 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:57fab481fee27b24608ed4a1304d87888fbd9374feae5f46028b864c400ab7b6
-size 33806
+oid sha256:a09b30b2ba8e9ebba884ea4b521a40da3276b4a9cbd7cfb419bcfb19c25f19cf
+size 34166