Beck90/mt5-amharic-antonym

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 12.0311
 ## Model description
@@ -41,27 +41,102 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 15
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 47.918        | 1.0   | 60   | 38.3330         |
-| 40.2943       | 2.0   | 120  | 27.6096         |
-| 34.4622       | 3.0   | 180  | 22.3320         |
-| 32.0208       | 4.0   | 240  | 19.9857         |
-| 29.1126       | 5.0   | 300  | 17.8354         |
-| 26.4962       | 6.0   | 360  | 16.1180         |
-| 24.9211       | 7.0   | 420  | 15.0172         |
-| 23.5472       | 8.0   | 480  | 14.3574         |
-| 22.2333       | 9.0   | 540  | 13.6512         |
-| 20.9453       | 10.0  | 600  | 13.2603         |
-| 20.1657       | 11.0  | 660  | 12.9213         |
-| 19.6012       | 12.0  | 720  | 12.5839         |
-| 19.6056       | 13.0  | 780  | 12.2716         |
-| 19.2069       | 14.0  | 840  | 12.0760         |
-| 18.8685       | 15.0  | 900  | 12.0311         |
 ### Framework versions

 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.5822
 ## Model description
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 100
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 48.1602       | 1.0   | 60   | 39.0911         |
+| 39.8397       | 2.0   | 120  | 28.2010         |
+| 33.5322       | 3.0   | 180  | 22.2332         |
+| 29.7329       | 4.0   | 240  | 18.5160         |
+| 26.4591       | 5.0   | 300  | 16.6270         |
+| 23.7199       | 6.0   | 360  | 14.9033         |
+| 21.4338       | 7.0   | 420  | 13.3966         |
+| 19.4494       | 8.0   | 480  | 12.1507         |
+| 17.5813       | 9.0   | 540  | 11.0042         |
+| 15.385        | 10.0  | 600  | 9.7896          |
+| 14.1214       | 11.0  | 660  | 8.7964          |
+| 12.824        | 12.0  | 720  | 7.9328          |
+| 11.4696       | 13.0  | 780  | 7.1627          |
+| 10.2074       | 14.0  | 840  | 6.4906          |
+| 9.0975        | 15.0  | 900  | 5.9028          |
+| 8.3801        | 16.0  | 960  | 5.3529          |
+| 7.7496        | 17.0  | 1020 | 4.9146          |
+| 6.7216        | 18.0  | 1080 | 4.4308          |
+| 6.2098        | 19.0  | 1140 | 3.9412          |
+| 5.4955        | 20.0  | 1200 | 3.4992          |
+| 5.0123        | 21.0  | 1260 | 3.1167          |
+| 4.3249        | 22.0  | 1320 | 2.7577          |
+| 3.7474        | 23.0  | 1380 | 2.3593          |
+| 3.3065        | 24.0  | 1440 | 2.0139          |
+| 2.901         | 25.0  | 1500 | 1.6806          |
+| 2.4941        | 26.0  | 1560 | 1.5083          |
+| 2.2495        | 27.0  | 1620 | 1.3519          |
+| 2.1328        | 28.0  | 1680 | 1.2387          |
+| 1.9432        | 29.0  | 1740 | 1.1626          |
+| 1.7885        | 30.0  | 1800 | 1.0925          |
+| 1.6632        | 31.0  | 1860 | 1.0438          |
+| 1.5964        | 32.0  | 1920 | 1.0213          |
+| 1.4927        | 33.0  | 1980 | 0.9974          |
+| 1.443         | 34.0  | 2040 | 0.9755          |
+| 1.4459        | 35.0  | 2100 | 0.9626          |
+| 1.4127        | 36.0  | 2160 | 0.9419          |
+| 1.3008        | 37.0  | 2220 | 0.9232          |
+| 1.3198        | 38.0  | 2280 | 0.9001          |
+| 1.2208        | 39.0  | 2340 | 0.8826          |
+| 1.2165        | 40.0  | 2400 | 0.8694          |
+| 1.2188        | 41.0  | 2460 | 0.8589          |
+| 1.1627        | 42.0  | 2520 | 0.8427          |
+| 1.155         | 43.0  | 2580 | 0.8290          |
+| 1.069         | 44.0  | 2640 | 0.8145          |
+| 1.0762        | 45.0  | 2700 | 0.8038          |
+| 1.0239        | 46.0  | 2760 | 0.7905          |
+| 1.0317        | 47.0  | 2820 | 0.7829          |
+| 1.0047        | 48.0  | 2880 | 0.7727          |
+| 0.9471        | 49.0  | 2940 | 0.7634          |
+| 0.9366        | 50.0  | 3000 | 0.7542          |
+| 0.9635        | 51.0  | 3060 | 0.7464          |
+| 0.8958        | 52.0  | 3120 | 0.7378          |
+| 0.9107        | 53.0  | 3180 | 0.7269          |
+| 0.8582        | 54.0  | 3240 | 0.7168          |
+| 0.8749        | 55.0  | 3300 | 0.7082          |
+| 0.8661        | 56.0  | 3360 | 0.6979          |
+| 0.838         | 57.0  | 3420 | 0.6865          |
+| 0.8453        | 58.0  | 3480 | 0.6786          |
+| 0.8125        | 59.0  | 3540 | 0.6671          |
+| 0.8392        | 60.0  | 3600 | 0.6605          |
+| 0.8039        | 61.0  | 3660 | 0.6539          |
+| 0.7836        | 62.0  | 3720 | 0.6474          |
+| 0.8213        | 63.0  | 3780 | 0.6412          |
+| 0.8254        | 64.0  | 3840 | 0.6374          |
+| 0.817         | 65.0  | 3900 | 0.6325          |
+| 0.8193        | 66.0  | 3960 | 0.6288          |
+| 0.7962        | 67.0  | 4020 | 0.6231          |
+| 0.7844        | 68.0  | 4080 | 0.6178          |
+| 0.7597        | 69.0  | 4140 | 0.6146          |
+| 0.7924        | 70.0  | 4200 | 0.6108          |
+| 0.7812        | 71.0  | 4260 | 0.6064          |
+| 0.7714        | 72.0  | 4320 | 0.6042          |
+| 0.8165        | 73.0  | 4380 | 0.6029          |
+| 0.7469        | 74.0  | 4440 | 0.6007          |
+| 0.7636        | 75.0  | 4500 | 0.5988          |
+| 0.7597        | 76.0  | 4560 | 0.5976          |
+| 0.7227        | 77.0  | 4620 | 0.5944          |
+| 0.7816        | 78.0  | 4680 | 0.5924          |
+| 0.7509        | 79.0  | 4740 | 0.5914          |
+| 0.7563        | 80.0  | 4800 | 0.5903          |
+| 0.7658        | 81.0  | 4860 | 0.5879          |
+| 0.7438        | 82.0  | 4920 | 0.5867          |
+| 0.7357        | 83.0  | 4980 | 0.5858          |
+| 0.7325        | 84.0  | 5040 | 0.5846          |
+| 0.741         | 85.0  | 5100 | 0.5838          |
+| 0.7294        | 86.0  | 5160 | 0.5833          |
+| 0.7199        | 87.0  | 5220 | 0.5826          |
+| 0.7642        | 88.0  | 5280 | 0.5820          |
+| 0.7459        | 89.0  | 5340 | 0.5821          |
+| 0.7152        | 90.0  | 5400 | 0.5822          |
 ### Framework versions
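
The updated table can be sanity-checked with a few lines of Python. The sketch below copies only a subset of the rows from the new run, confirms that the step count grows by a fixed amount per epoch, and finds the epoch with the lowest validation loss. The names `rows`, `steps_per_epoch`, and `best_epoch` are illustrative and are not part of the training script, which this diff does not show.

```python
# (epoch, step, validation_loss) rows copied from the updated results table.
# Only a subset is listed here to keep the example short.
rows = [
    (1, 60, 39.0911),
    (15, 900, 5.9028),
    (30, 1800, 1.0925),
    (60, 3600, 0.6605),
    (86, 5160, 0.5833),
    (87, 5220, 0.5826),
    (88, 5280, 0.5820),
    (89, 5340, 0.5821),
    (90, 5400, 0.5822),
]


def steps_per_epoch(rows):
    """Return the optimizer steps per epoch, checking that every row agrees."""
    ratios = {step / epoch for epoch, step, _ in rows}
    if len(ratios) != 1:
        raise ValueError(f"inconsistent step/epoch ratios: {sorted(ratios)}")
    return int(ratios.pop())


def best_epoch(rows):
    """Return the epoch whose validation loss is lowest among the given rows."""
    return min(rows, key=lambda row: row[2])[0]


final_loss = rows[-1][2]
best_loss = min(loss for _, _, loss in rows)

print(steps_per_epoch(rows))             # 60
print(best_epoch(rows))                  # 88
print(round(final_loss - best_loss, 4))  # 0.0002
```

Among these rows the validation loss bottoms out at epoch 88 (0.5820) and rises by 0.0002 by epoch 90, so the reported evaluation loss of 0.5822 corresponds to the epoch 90 row rather than to the best row shown.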

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4e147d61c983754acd18b9f824656fb4e745f4a593220d37bf3b579682846083
 size 1200729512

 version https://git-lfs.github.com/spec/v1
+oid sha256:a8562bfea4ff823fa1f6c94982164b6756ac843c9c889a0bf0077b2f0cabf1c9
 size 1200729512
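
The `model.safetensors` entry in this diff is a Git LFS pointer file, not the weights themselves: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. A minimal sketch of reading the two pointers above and comparing them follows. `parse_lfs_pointer` is a hypothetical helper written for this example, not a function from Git LFS or `huggingface_hub`.

```python
OLD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4e147d61c983754acd18b9f824656fb4e745f4a593220d37bf3b579682846083
size 1200729512
"""

NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a8562bfea4ff823fa1f6c94982164b6756ac843c9c889a0bf0077b2f0cabf1c9
size 1200729512
"""


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash, and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid value has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

print(old["size"] == new["size"])      # True: same byte size
print(old["digest"] == new["digest"])  # False: different content
```

The unchanged size with a changed digest is what one would expect when the same architecture is retrained: the tensor shapes and dtypes stay the same, while the weight values differ.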