halo-69/Bloom_3b_squad

Question Answering

Generated from Trainer

text-generation-inference


halo-69 committed on Sep 25, 2023

Commit 8b69e8a · 1 Parent(s): 8338b03

End of training

Files changed (2)

README.md +15 -15
adapter_model.bin +1 -1

README.md CHANGED

@@ -17,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [bigscience/bloom-3b](https://huggingface.co/bigscience/bloom-3b) on the squad dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.8279
 ## Model description
@@ -38,7 +38,7 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
 - train_batch_size: 48
-- eval_batch_size: 48
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
@@ -47,18 +47,18 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| No log        | 1.0   | 313  | 2.7425          |
-| 2.7964        | 2.0   | 626  | 2.7329          |
-| 2.7964        | 3.0   | 939  | 2.7427          |
-| 2.6722        | 4.0   | 1252 | 2.7565          |
-| 2.6103        | 5.0   | 1565 | 2.7763          |
-| 2.6103        | 6.0   | 1878 | 2.7917          |
-| 2.5551        | 7.0   | 2191 | 2.8087          |
-| 2.5208        | 8.0   | 2504 | 2.8161          |
-| 2.5208        | 9.0   | 2817 | 2.8245          |
-| 2.4993        | 10.0  | 3130 | 2.8279          |
 ### Framework versions
@@ -66,4 +66,4 @@ The following hyperparameters were used during training:
 - Transformers 4.34.0.dev0
 - Pytorch 2.0.1+cu118
 - Datasets 2.14.5
-- Tokenizers 0.13.3

 This model is a fine-tuned version of [bigscience/bloom-3b](https://huggingface.co/bigscience/bloom-3b) on the squad dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.7859
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
 - train_batch_size: 48
+- eval_batch_size: 16
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 ### Training results
+| Training Loss | Epoch | Step  | Validation Loss |
+|:-------------:|:-----:|:-----:|:---------------:|
+| 3.0058        | 1.0   | 1643  | 2.7510          |
+| 2.7801        | 2.0   | 3286  | 2.7497          |
+| 2.7284        | 3.0   | 4929  | 2.7536          |
+| 2.7001        | 4.0   | 6572  | 2.7601          |
+| 2.6811        | 5.0   | 8215  | 2.7669          |
+| 2.6811        | 6.0   | 9858  | 2.7722          |
+| 2.6639        | 7.0   | 11501 | 2.7780          |
+| 2.6492        | 8.0   | 13144 | 2.7817          |
+| 2.6414        | 9.0   | 14787 | 2.7841          |
+| 2.6354        | 10.0  | 16430 | 2.7859          |
 ### Framework versions
 - Transformers 4.34.0.dev0
 - Pytorch 2.0.1+cu118
 - Datasets 2.14.5
+- Tokenizers 0.14.0
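
The updated training results table can be checked with a few lines of Python. The sketch below is not part of the commit: it copies the ten `+` rows by hand, picks the epoch with the lowest validation loss, and converts losses to perplexity. The helper names `best_epoch` and `perplexity` are hypothetical, and the perplexity conversion assumes the reported loss is a mean token-level cross-entropy in nats, which the model card does not state.

```python
import math

# (epoch, training_loss, validation_loss), copied from the "+" rows of the
# README.md diff above. Nothing here is read from the repository itself.
NEW_RUN = [
    (1, 3.0058, 2.7510),
    (2, 2.7801, 2.7497),
    (3, 2.7284, 2.7536),
    (4, 2.7001, 2.7601),
    (5, 2.6811, 2.7669),
    (6, 2.6811, 2.7722),
    (7, 2.6639, 2.7780),
    (8, 2.6492, 2.7817),
    (9, 2.6414, 2.7841),
    (10, 2.6354, 2.7859),
]


def best_epoch(rows):
    """Return the (epoch, train_loss, val_loss) row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def perplexity(loss):
    """exp(loss). Assumption: loss is mean token cross-entropy in nats."""
    return math.exp(loss)


best = best_epoch(NEW_RUN)
final = NEW_RUN[-1]
print(f"best epoch:  {best[0]} (val loss {best[2]:.4f}, ppl ~ {perplexity(best[2]):.2f})")
print(f"final epoch: {final[0]} (val loss {final[2]:.4f}, ppl ~ {perplexity(final[2]):.2f})")
```

In these rows the validation loss is lowest at epoch 2 and rises in every later epoch while the training loss keeps falling, a pattern that may indicate overfitting. The `Loss: 2.7859` reported at the top of the model card is the final-epoch value, not the minimum.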

adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:17b1e3a799188c48c823654c2a234712c88e58a13e6fa9d9f247fc62a0c20edb
 size 19683045

 version https://git-lfs.github.com/spec/v1
+oid sha256:9e9fc51d94ec3f1234324ab816746020a3e5fe81b3c9a60be474d98163590e26
 size 19683045
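
The `adapter_model.bin` entry is a Git LFS pointer, so the diff shows only a new `oid` with an unchanged `size`. As a minimal sketch, assuming the adapter file has been downloaded separately, a local copy could be checked against the new pointer as shown below. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for this illustration, not part of any Git LFS or Hugging Face API.

```python
import hashlib

# New pointer contents, copied from the "+" side of the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:9e9fc51d94ec3f1234324ab816746020a3e5fe81b3c9a60be474d98163590e26
size 19683045
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(data, pointer):
    """Return True if the bytes match both the size and the sha256 oid of the pointer."""
    algo, _, expected = pointer["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    same_size = len(data) == int(pointer["size"])
    same_hash = hashlib.sha256(data).hexdigest() == expected
    return same_size and same_hash


pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["oid"], pointer["size"])
```

To check a real download, read the file as bytes (for example `open(path, "rb").read()`, where `path` is wherever the file was saved) and pass the result to `matches_pointer` together with `pointer`.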