yale-cultural-heritage/name-parser-small

text2text-generation

Generated from Trainer

text-generation-inference


wjbmattingly committed on Jun 3, 2025

Commit 0ea2f4b (verified) · 1 Parent(s): 1e0522e

End of training

Files changed (1)

README.md +16 -16


@@ -18,8 +18,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0114
-- Accuracy: 0.9951
 ## Model description
@@ -39,11 +39,11 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size: 32
 - eval_batch_size: 4
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 128
 - optimizer: Use adafactor and the args are:
 No additional optimizer arguments
 - lr_scheduler_type: linear
@@ -52,18 +52,18 @@ No additional optimizer arguments
 ### Training results
-| Training Loss | Epoch   | Step  | Validation Loss | Accuracy |
-|:-------------:|:-------:|:-----:|:---------------:|:--------:|
-| 0.0284        | 1.4297  | 2000  | 0.0187          | 0.9929   |
-| 0.0209        | 2.8593  | 4000  | 0.0149          | 0.9940   |
-| 0.0184        | 4.2888  | 6000  | 0.0137          | 0.9943   |
-| 0.0173        | 5.7185  | 8000  | 0.0127          | 0.9946   |
-| 0.0166        | 7.1480  | 10000 | 0.0123          | 0.9948   |
-| 0.0159        | 8.5777  | 12000 | 0.0119          | 0.9948   |
-| 0.0155        | 10.0071 | 14000 | 0.0117          | 0.9949   |
-| 0.015         | 11.4368 | 16000 | 0.0116          | 0.9949   |
-| 0.0149        | 12.8665 | 18000 | 0.0115          | 0.9950   |
-| 0.0148        | 14.2960 | 20000 | 0.0114          | 0.9951   |
 ### Framework versions

 This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.0124
+- Accuracy: 0.9947
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
+- train_batch_size: 16
 - eval_batch_size: 4
 - seed: 42
 - gradient_accumulation_steps: 4
+- total_train_batch_size: 64
 - optimizer: Use adafactor and the args are:
 No additional optimizer arguments
 - lr_scheduler_type: linear
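
The two changed batch-size lines are linked by simple arithmetic: the reported `total_train_batch_size` is the per-device batch size multiplied by the gradient accumulation steps and the number of devices. A minimal sketch of that relation, assuming a single device (the diff does not state the device count, and the function name is illustrative rather than part of any library):

```python
def effective_batch_size(per_device: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer step."""
    return per_device * grad_accum_steps * num_devices


# Values taken from the removed (-) and added (+) lines of this diff.
# num_devices=1 is an assumption; it is consistent with the reported totals.
before = effective_batch_size(per_device=32, grad_accum_steps=4)
after = effective_batch_size(per_device=16, grad_accum_steps=4)

print(before, after)
```

Under that assumption the results match the reported totals of 128 and 64, so halving `train_batch_size` while keeping `gradient_accumulation_steps` at 4 halves the effective batch size.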
 ### Training results
+| Training Loss | Epoch  | Step  | Validation Loss | Accuracy |
+|:-------------:|:------:|:-----:|:---------------:|:--------:|
+| 0.032         | 0.7150 | 2000  | 0.0202          | 0.9922   |
+| 0.0234        | 1.4297 | 4000  | 0.0164          | 0.9936   |
+| 0.0203        | 2.1444 | 6000  | 0.0146          | 0.9941   |
+| 0.0187        | 2.8594 | 8000  | 0.0138          | 0.9942   |
+| 0.0178        | 3.5741 | 10000 | 0.0133          | 0.9943   |
+| 0.0173        | 4.2889 | 12000 | 0.0130          | 0.9944   |
+| 0.0175        | 5.0036 | 14000 | 0.0126          | 0.9946   |
+| 0.0164        | 5.7186 | 16000 | 0.0124          | 0.9947   |
+| 0.0165        | 6.4333 | 18000 | 0.0124          | 0.9947   |
+| 0.0166        | 7.1480 | 20000 | 0.0124          | 0.9947   |
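
The Epoch and Step columns of the two tables can be cross-checked against the batch sizes. Dividing a step count by its epoch value gives the optimizer steps per epoch, and multiplying that by the total train batch size gives a rough estimate of the training-set size. The sketch below uses only numbers from the tables; the helper name is illustrative, and the example count is an approximation, since the dataset itself is not described in the card:

```python
def steps_per_epoch(step: int, epoch: float) -> float:
    """Optimizer steps needed to complete one pass over the training set."""
    return step / epoch


# Last row of each table: both runs were logged up to step 20000.
old_spe = steps_per_epoch(20000, 14.2960)  # removed table, total batch size 128
new_spe = steps_per_epoch(20000, 7.1480)   # added table, total batch size 64

# Approximate number of training examples implied by each run.
old_examples = old_spe * 128
new_examples = new_spe * 64

print(round(old_spe), round(new_spe))
print(round(old_examples), round(new_examples))
```

Both runs imply the same training-set size, which suggests the data did not change between commits. With the same 20000 steps but half the effective batch size, the new run covers about half as many epochs (7.1 versus 14.3). That may account for the slightly higher final validation loss (0.0124 versus 0.0114), though the diff alone does not confirm the cause.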
 ### Framework versions