End of training

Files changed (3)

README.md +23 -23
model.safetensors +1 -1
runs/Jan23_16-34-12_ultramarine/events.out.tfevents.1737639253.ultramarine.3988838.0 +2 -2

README.md CHANGED

@@ -19,9 +19,9 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.8071
-- F1: 0.7980
-- Accuracy: 0.7959
 ## Model description
@@ -47,7 +47,7 @@ The following hyperparameters were used during training:
 - gradient_accumulation_steps: 2
 - total_train_batch_size: 256
 - optimizer: Use adamw_torch with betas=(0.9,0.98) and epsilon=1e-06 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 500
 - num_epochs: 20
@@ -55,25 +55,25 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch   | Step | Validation Loss | F1     | Accuracy |
 |:-------------:|:-------:|:----:|:---------------:|:------:|:--------:|
-| 3.981         | 1.0     | 19   | 1.9665          | 0.0904 | 0.1633   |
-| 3.6781        | 2.0     | 38   | 1.8383          | 0.1882 | 0.2449   |
-| 3.1283        | 3.0     | 57   | 1.4232          | 0.4896 | 0.5061   |
-| 2.4044        | 4.0     | 76   | 1.2398          | 0.5488 | 0.5673   |
-| 1.9166        | 5.0     | 95   | 1.1468          | 0.5726 | 0.6041   |
-| 1.7122        | 6.0     | 114  | 1.0013          | 0.6649 | 0.6653   |
-| 1.4626        | 7.0     | 133  | 0.8954          | 0.7198 | 0.7224   |
-| 1.2173        | 8.0     | 152  | 0.7306          | 0.7611 | 0.7592   |
-| 1.0648        | 9.0     | 171  | 0.7449          | 0.7412 | 0.7388   |
-| 0.9008        | 10.0    | 190  | 0.6874          | 0.7754 | 0.7714   |
-| 0.856         | 11.0    | 209  | 0.6584          | 0.8071 | 0.8082   |
-| 0.7557        | 12.0    | 228  | 0.6046          | 0.7854 | 0.7837   |
-| 0.472         | 13.0    | 247  | 0.8246          | 0.7428 | 0.7429   |
-| 0.4386        | 14.0    | 266  | 0.7892          | 0.8042 | 0.8082   |
-| 0.3418        | 15.0    | 285  | 0.6727          | 0.8248 | 0.8286   |
-| 0.2662        | 16.0    | 304  | 0.8244          | 0.8144 | 0.8163   |
-| 0.1774        | 17.0    | 323  | 0.7832          | 0.8083 | 0.8041   |
-| 0.1246        | 18.0    | 342  | 0.5501          | 0.8703 | 0.8694   |
-| 0.121         | 18.9730 | 360  | 0.8071          | 0.7980 | 0.7959   |
 ### Framework versions

 This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.8739
+- F1: 0.8061
+- Accuracy: 0.8082
 ## Model description
 - gradient_accumulation_steps: 2
 - total_train_batch_size: 256
 - optimizer: Use adamw_torch with betas=(0.9,0.98) and epsilon=1e-06 and optimizer_args=No additional optimizer arguments
+- lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 500
 - num_epochs: 20
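
Note on the scheduler change: the card lists 500 warmup steps, while the last logged step in both results tables is 360. The sketch below is a minimal illustration of the usual "linear warmup, then decay" multiplier shapes, not the library's own code. `linear_multiplier` and `cosine_multiplier` are hypothetical helper names, and `TOTAL_STEPS = 360` is an assumption taken from the last table row. Under those assumptions both schedules stay in warmup for the whole run, so they produce the same learning-rate multiplier at every logged step.

```python
import math

WARMUP_STEPS = 500  # lr_scheduler_warmup_steps from the card
TOTAL_STEPS = 360   # assumption: the last logged step is the end of training


def linear_multiplier(step, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup, then linear decay to zero (hypothetical helper)."""
    if step < warmup:
        return step / max(1, warmup)
    return max(0.0, (total - step) / max(1, total - warmup))


def cosine_multiplier(step, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup, then half-cosine decay to zero (hypothetical helper)."""
    if step < warmup:
        return step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Steps at which the tables log an evaluation: 19, 38, ..., 342, then 360.
LOGGED_STEPS = [19 * k for k in range(1, 19)] + [360]

for step in LOGGED_STEPS:
    lin = linear_multiplier(step)
    cos = cosine_multiplier(step)
    print(f"step {step:3d}  linear={lin:.3f}  cosine={cos:.3f}")
```

If these assumptions hold, the differences between the two results tables may owe more to run-to-run variation than to the scheduler type itself; the diff does not contain enough information to confirm this.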
 | Training Loss | Epoch   | Step | Validation Loss | F1     | Accuracy |
 |:-------------:|:-------:|:----:|:---------------:|:------:|:--------:|
+| 4.0021        | 1.0     | 19   | 1.9146          | 0.0636 | 0.1469   |
+| 3.6752        | 2.0     | 38   | 1.7785          | 0.3022 | 0.3388   |
+| 3.2521        | 3.0     | 57   | 1.4559          | 0.4311 | 0.4735   |
+| 2.6907        | 4.0     | 76   | 1.1927          | 0.5475 | 0.5714   |
+| 2.2003        | 5.0     | 95   | 0.9852          | 0.6614 | 0.6571   |
+| 1.7928        | 6.0     | 114  | 0.8017          | 0.7147 | 0.7102   |
+| 1.4909        | 7.0     | 133  | 0.8603          | 0.7070 | 0.7020   |
+| 1.3136        | 8.0     | 152  | 0.6970          | 0.7395 | 0.7429   |
+| 1.1483        | 9.0     | 171  | 0.5679          | 0.7774 | 0.7755   |
+| 0.903         | 10.0    | 190  | 0.9122          | 0.7078 | 0.7061   |
+| 0.886         | 11.0    | 209  | 0.6270          | 0.7707 | 0.7755   |
+| 0.7609        | 12.0    | 228  | 0.6756          | 0.8038 | 0.8082   |
+| 0.6929        | 13.0    | 247  | 0.5790          | 0.8290 | 0.8327   |
+| 0.4927        | 14.0    | 266  | 0.7072          | 0.8067 | 0.8082   |
+| 0.3282        | 15.0    | 285  | 0.6293          | 0.8490 | 0.8490   |
+| 0.2706        | 16.0    | 304  | 0.8920          | 0.7867 | 0.7878   |
+| 0.2311        | 17.0    | 323  | 0.7759          | 0.8466 | 0.8490   |
+| 0.1268        | 18.0    | 342  | 0.7496          | 0.8324 | 0.8327   |
+| 0.1276        | 18.9730 | 360  | 0.8739          | 0.8061 | 0.8082   |
 ### Framework versions
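
Note on the reported metrics: in both versions of the card, the headline Loss/F1/Accuracy values equal the last table row (step 360), not the best row. The following sketch, with values copied from the two tables in this diff, finds the best-F1 step for each run and compares it with the final step. `summarize` is a hypothetical helper introduced only for this illustration.

```python
# (step, validation_loss, f1) rows copied from the removed (cosine) table
OLD_RUN = [
    (19, 1.9665, 0.0904), (38, 1.8383, 0.1882), (57, 1.4232, 0.4896),
    (76, 1.2398, 0.5488), (95, 1.1468, 0.5726), (114, 1.0013, 0.6649),
    (133, 0.8954, 0.7198), (152, 0.7306, 0.7611), (171, 0.7449, 0.7412),
    (190, 0.6874, 0.7754), (209, 0.6584, 0.8071), (228, 0.6046, 0.7854),
    (247, 0.8246, 0.7428), (266, 0.7892, 0.8042), (285, 0.6727, 0.8248),
    (304, 0.8244, 0.8144), (323, 0.7832, 0.8083), (342, 0.5501, 0.8703),
    (360, 0.8071, 0.7980),
]

# (step, validation_loss, f1) rows copied from the added (linear) table
NEW_RUN = [
    (19, 1.9146, 0.0636), (38, 1.7785, 0.3022), (57, 1.4559, 0.4311),
    (76, 1.1927, 0.5475), (95, 0.9852, 0.6614), (114, 0.8017, 0.7147),
    (133, 0.8603, 0.7070), (152, 0.6970, 0.7395), (171, 0.5679, 0.7774),
    (190, 0.9122, 0.7078), (209, 0.6270, 0.7707), (228, 0.6756, 0.8038),
    (247, 0.5790, 0.8290), (266, 0.7072, 0.8067), (285, 0.6293, 0.8490),
    (304, 0.8920, 0.7867), (323, 0.7759, 0.8466), (342, 0.7496, 0.8324),
    (360, 0.8739, 0.8061),
]


def summarize(rows):
    """Return the best-F1 row and the final row of one run."""
    best = max(rows, key=lambda row: row[2])
    final = rows[-1]
    return {
        "best_step": best[0],
        "best_f1": best[2],
        "final_step": final[0],
        "final_f1": final[2],
    }


for name, rows in (("old (cosine)", OLD_RUN), ("new (linear)", NEW_RUN)):
    print(name, summarize(rows))
```

In both runs the best F1 occurs before the final step (step 342 in the old run, step 285 in the new one), and validation loss rises again afterwards while training loss keeps falling. Whether intermediate checkpoints were saved is not visible in this diff.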

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:57620871b170ae43ae6dd37db41e5a0e9ccaf10d1dfc8fb7385200145c059845
 size 598455164

 version https://git-lfs.github.com/spec/v1
+oid sha256:c11cc46db2f4c6ed97696f455ad98b49a4a1f981e0d7dfc5556a7f59c19f5a6f
 size 598455164
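
Note on the pointer file: a Git LFS pointer records the SHA-256 digest and byte size of the real file, so a downloaded copy can be checked against the `oid` and `size` lines above. The sketch below is one way to do that with the standard library; `lfs_pointer_fields` and `matches_pointer` are hypothetical helper names, and the local path `model.safetensors` is an assumption about where the file was downloaded.

```python
import hashlib
from pathlib import Path


def lfs_pointer_fields(path, chunk_size=1 << 20):
    """Compute the (oid, size) pair a Git LFS pointer would record for a file."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so a ~600 MB weights file is not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return f"sha256:{digest.hexdigest()}", size


def matches_pointer(path, expected_oid, expected_size):
    """Return True if the file's digest and size match the pointer values."""
    oid, size = lfs_pointer_fields(path)
    return oid == expected_oid and size == expected_size


if __name__ == "__main__":
    # Values from the added side of this diff; the local path is an assumption.
    local_copy = Path("model.safetensors")
    if local_copy.exists():
        ok = matches_pointer(
            local_copy,
            "sha256:c11cc46db2f4c6ed97696f455ad98b49a4a1f981e0d7dfc5556a7f59c19f5a6f",
            598455164,
        )
        print("pointer matches" if ok else "pointer mismatch")
```

The size is identical before and after the commit while the digest changes, which is consistent with the same architecture being saved with different weights.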

runs/Jan23_16-34-12_ultramarine/events.out.tfevents.1737639253.ultramarine.3988838.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:12224b2c9a7e4bfb1f3d0fc4ef1966747ded66a42c9c8799b3f0b71d29262aec
-size 16305

 version https://git-lfs.github.com/spec/v1
+oid sha256:a4268d132d189f868b1df1325e68a3aa14be8ab45e755caa6e0d45fa4a138f63
+size 17239