FemkeBakker
/

AmsterdamDocClassificationGEITje200T2Epochs

@@ -8,6 +8,10 @@ tags:
 model-index:
 - name: AmsterdamDocClassificationGEITje200T2Epochs
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -15,24 +19,24 @@ should probably proofread and complete it, then remove this comment. -->
 # AmsterdamDocClassificationGEITje200T2Epochs
-This model is a fine-tuned version of [Rijgersberg/GEITje-7B-chat-v2](https://huggingface.co/Rijgersberg/GEITje-7B-chat-v2) on the [AmsterdamDocClassification](https://huggingface.co/datasets/FemkeBakker/AmsterdamBalancedFirst200Tokens) dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.5796
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
@@ -62,10 +66,15 @@ The following hyperparameters were used during training:
 | 0.4699        | 1.7903 | 1107 | 0.5796          |
 | 0.5434        | 1.9891 | 1230 | 0.5796          |
 ### Framework versions
 - Transformers 4.41.1
 - Pytorch 2.3.0+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1

 model-index:
 - name: AmsterdamDocClassificationGEITje200T2Epochs
   results: []
+datasets:
+- FemkeBakker/AmsterdamBalancedFirst200Tokens
+language:
+- nl
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 # AmsterdamDocClassificationGEITje200T2Epochs
+As part of the Assessing Large Language Models for Document Classification project by the Municipality of Amsterdam, we fine-tune Mistral, Llama, and GEITje for document classification.
+The fine-tuning is performed using the [AmsterdamBalancedFirst200Tokens](https://huggingface.co/datasets/FemkeBakker/AmsterdamBalancedFirst200Tokens) dataset, which consists of documents truncated to the first 200 tokens.
+In our research, we evaluate the fine-tuning of these LLMs across one, two, and three epochs.
+This model is a fine-tuned version of [Rijgersberg/GEITje-7B-chat-v2](https://huggingface.co/Rijgersberg/GEITje-7B-chat-v2) and has been fine-tuned for two epochs.
+It achieves the following results on the evaluation set:
+- Loss: 0.5796
 ## Training and evaluation data
+- The training data consists of 9900 documents and their labels formatted into conversations.
+- The evaluation data consists of 1100 documents and their labels formatted into conversations.
 ## Training procedure
+See the [GitHub](https://github.com/Amsterdam-Internships/document-classification-using-large-language-models) for specifics about the training and the code.
 ### Training hyperparameters
 The following hyperparameters were used during training:
 | 0.4699        | 1.7903 | 1107 | 0.5796          |
 | 0.5434        | 1.9891 | 1230 | 0.5796          |
+Training time: it took in total 1 hour and 36 minutes to fine-tune the model for two epochs.
 ### Framework versions
 - Transformers 4.41.1
 - Pytorch 2.3.0+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1
+### Acknowledgements
+This model was trained as part of [insert thesis info] in collaboration with Amsterdam Intelligence for the City of Amsterdam.