garypak committed on
Commit 10d5b4c · verified · 1 Parent(s): 635ab58

End of training

Files changed (4):
  1. README.md +23 -11
  2. pytorch_model.bin +1 -1
  3. tokenizer.json +1 -1
  4. training_args.bin +1 -1
README.md CHANGED
@@ -4,6 +4,8 @@ license: llama3.2
  base_model: meta-llama/Llama-3.2-1B
  tags:
  - generated_from_trainer
+ metrics:
+ - accuracy
  model-index:
  - name: llama3.2-1b-rumour-samples
    results: []
@@ -16,13 +18,8 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - eval_loss: 0.7789
- - eval_accuracy: 0.6458
- - eval_runtime: 112.7901
- - eval_samples_per_second: 5.532
- - eval_steps_per_second: 1.383
- - epoch: 5.9947
- - step: 846
+ - Loss: 2.5479
+ - Accuracy: 0.5545
 
  ## Model description
 
@@ -42,16 +39,31 @@ More information needed
 
  The following hyperparameters were used during training:
  - learning_rate: 2e-05
- - train_batch_size: 4
- - eval_batch_size: 4
+ - train_batch_size: 2
+ - eval_batch_size: 2
  - seed: 666
  - gradient_accumulation_steps: 4
- - total_train_batch_size: 16
+ - total_train_batch_size: 8
  - optimizer: Use adamw_hf with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
- - num_epochs: 6
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 8
  - mixed_precision_training: Native AMP
 
+ ### Training results
+
+ | Training Loss | Epoch  | Step | Validation Loss | Accuracy |
+ |:-------------:|:------:|:----:|:---------------:|:--------:|
+ | 2.5392        | 0.9975 | 298  | 2.2526          | 0.2644   |
+ | 1.2697        | 1.9975 | 596  | 1.8002          | 0.5545   |
+ | 1.5248        | 2.9975 | 894  | 2.5925          | 0.4375   |
+ | 0.4173        | 3.9975 | 1192 | 2.3996          | 0.5337   |
+ | 0.4028        | 4.9975 | 1490 | 2.4014          | 0.5369   |
+ | 0.1915        | 5.9975 | 1788 | 2.5256          | 0.5465   |
+ | 0.1105        | 6.9975 | 2086 | 2.5003          | 0.5497   |
+ | 0.0159        | 7.9975 | 2384 | 2.5479          | 0.5545   |
+
+
  ### Framework versions
 
  - Transformers 4.47.0.dev0
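The updated hyperparameters above are internally consistent, and the derived values can be cross-checked with a minimal sketch. Assumptions not stated in the card: training ran on a single device, and the warmup length is the warmup ratio applied to the total optimization steps (2384, the final step in the training-results table).

```python
# Sketch: derive the card's reported totals from its listed hyperparameters.
per_device_train_batch_size = 2   # "train_batch_size" in the card
gradient_accumulation_steps = 4
num_devices = 1                   # assumption: single GPU (not stated in the card)

# Effective batch size per optimizer step.
total_train_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Warmup length implied by lr_scheduler_warmup_ratio over the full run.
total_optimization_steps = 2384   # final step in the training-results table
lr_scheduler_warmup_ratio = 0.1
warmup_steps = int(lr_scheduler_warmup_ratio * total_optimization_steps)

print(total_train_batch_size)  # matches the card's total_train_batch_size: 8
print(warmup_steps)
```

This reproduces the card's `total_train_batch_size: 8`, consistent with the halved per-device batch size (4 → 2) compensated only partially by the unchanged accumulation factor.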
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0a1a743b669e0c3a94556a1e9f2ec7cc2e9c70fc31e75e485562fdc557de753a
+ oid sha256:6fe3ed6482106b6416cfd621de39cf2e82d4d6222a13045a3513442d3fd0cc08
  size 2037172570
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:2e148d3e6c6fcb0795ca56bfc18350336fe896322130403009e94a32a3bb1dd1
+ oid sha256:831042cfb3bc13c9bdea5594375c05d54649a6981310d67b62a841fef0e18af0
  size 17210372
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:541c439070f06a357dd7e40bfc1253d89c1bd5b23ce335b6b18f565deec4e0dc
+ oid sha256:c25a361e239b5081cebeeb1ab78af742e2012312db404b4e17cdbe0182505e01
  size 5304
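The three binary diffs above change only Git LFS pointer files: the `oid sha256:` line is the SHA-256 digest of the real file, and `size` is its byte length. A downloaded artifact can therefore be verified against its pointer by hashing it; a minimal sketch (a throwaway file stands in for e.g. `pytorch_model.bin`):

```shell
# A Git LFS pointer records the SHA-256 of the actual file. To verify a
# download, hash it and compare with the pointer's "oid sha256:" value.
# Throwaway stand-in for the real binary:
printf 'demo payload' > /tmp/lfs_demo.bin
sha256sum /tmp/lfs_demo.bin | cut -d ' ' -f 1
```

The printed digest should equal the hex string after `oid sha256:` in the corresponding pointer file.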