garypak committed on
Commit 10d5b4c · verified · 1 Parent(s): 635ab58

End of training

Files changed (4):
  1. README.md +23 -11
  2. pytorch_model.bin +1 -1
  3. tokenizer.json +1 -1
  4. training_args.bin +1 -1
README.md CHANGED
@@ -4,6 +4,8 @@ license: llama3.2
  base_model: meta-llama/Llama-3.2-1B
  tags:
  - generated_from_trainer
+ metrics:
+ - accuracy
  model-index:
  - name: llama3.2-1b-rumour-samples
    results: []
@@ -16,13 +18,8 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - eval_loss: 0.7789
- - eval_accuracy: 0.6458
- - eval_runtime: 112.7901
- - eval_samples_per_second: 5.532
- - eval_steps_per_second: 1.383
- - epoch: 5.9947
- - step: 846
+ - Loss: 2.5479
+ - Accuracy: 0.5545
 
  ## Model description
 
@@ -42,16 +39,31 @@ More information needed
 
  The following hyperparameters were used during training:
  - learning_rate: 2e-05
- - train_batch_size: 4
- - eval_batch_size: 4
+ - train_batch_size: 2
+ - eval_batch_size: 2
  - seed: 666
  - gradient_accumulation_steps: 4
- - total_train_batch_size: 16
+ - total_train_batch_size: 8
  - optimizer: Use adamw_hf with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
- - num_epochs: 6
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 8
  - mixed_precision_training: Native AMP
 
+ ### Training results
+
+ | Training Loss | Epoch  | Step | Validation Loss | Accuracy |
+ |:-------------:|:------:|:----:|:---------------:|:--------:|
+ | 2.5392        | 0.9975 | 298  | 2.2526          | 0.2644   |
+ | 1.2697        | 1.9975 | 596  | 1.8002          | 0.5545   |
+ | 1.5248        | 2.9975 | 894  | 2.5925          | 0.4375   |
+ | 0.4173        | 3.9975 | 1192 | 2.3996          | 0.5337   |
+ | 0.4028        | 4.9975 | 1490 | 2.4014          | 0.5369   |
+ | 0.1915        | 5.9975 | 1788 | 2.5256          | 0.5465   |
+ | 0.1105        | 6.9975 | 2086 | 2.5003          | 0.5497   |
+ | 0.0159        | 7.9975 | 2384 | 2.5479          | 0.5545   |
+
+
  ### Framework versions
 
  - Transformers 4.47.0.dev0
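The updated hyperparameters above are internally consistent, and the derived values can be cross-checked with a minimal sketch. Assumptions not stated in the card: training ran on a single device, and the warmup length is the warmup ratio applied to the total optimization steps (2384, the final step in the training-results table).

```python
# Sketch: derive the card's reported totals from its listed hyperparameters.
per_device_train_batch_size = 2   # "train_batch_size" in the card
gradient_accumulation_steps = 4
num_devices = 1                   # assumption: single GPU (not stated in the card)

# Effective batch size per optimizer step.
total_train_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Warmup length implied by lr_scheduler_warmup_ratio over the full run.
total_optimization_steps = 2384   # final step in the training-results table
lr_scheduler_warmup_ratio = 0.1
warmup_steps = int(lr_scheduler_warmup_ratio * total_optimization_steps)

print(total_train_batch_size)  # matches the card's total_train_batch_size: 8
print(warmup_steps)
```

This reproduces the card's `total_train_batch_size: 8`, consistent with the halved per-device batch size (4 → 2) compensated only partially by the unchanged accumulation factor.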
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0a1a743b669e0c3a94556a1e9f2ec7cc2e9c70fc31e75e485562fdc557de753a
+ oid sha256:6fe3ed6482106b6416cfd621de39cf2e82d4d6222a13045a3513442d3fd0cc08
  size 2037172570
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:2e148d3e6c6fcb0795ca56bfc18350336fe896322130403009e94a32a3bb1dd1
+ oid sha256:831042cfb3bc13c9bdea5594375c05d54649a6981310d67b62a841fef0e18af0
  size 17210372
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:541c439070f06a357dd7e40bfc1253d89c1bd5b23ce335b6b18f565deec4e0dc
+ oid sha256:c25a361e239b5081cebeeb1ab78af742e2012312db404b4e17cdbe0182505e01
  size 5304
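The three binary diffs above change only Git LFS pointer files: the `oid sha256:` line is the SHA-256 digest of the real file, and `size` is its byte length. A downloaded artifact can therefore be verified against its pointer by hashing it; a minimal sketch (a throwaway file stands in for e.g. `pytorch_model.bin`):

```shell
# A Git LFS pointer records the SHA-256 of the actual file. To verify a
# download, hash it and compare with the pointer's "oid sha256:" value.
# Throwaway stand-in for the real binary:
printf 'demo payload' > /tmp/lfs_demo.bin
sha256sum /tmp/lfs_demo.bin | cut -d ' ' -f 1
```

The printed digest should equal the hex string after `oid sha256:` in the corresponding pointer file.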