dklpp committed on
Commit 2171d06 · verified · 1 Parent(s): b4ffdf1

End of training

Files changed (1)
README.md +11 -17
README.md CHANGED
@@ -6,11 +6,6 @@ tags:
  - base_model:adapter:meta-llama/Llama-3.1-8B-Instruct
  - lora
  - transformers
- metrics:
- - accuracy
- - precision
- - recall
- - f1
  model-index:
  - name: llama3_ft_section_classifier
    results: []
@@ -23,11 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 985.5
- - Accuracy: 0.3939
- - Precision: 0.3631
- - Recall: 0.3939
- - F1: 0.3481
+ - Loss: 2.4151
 
  ## Model description
 
@@ -46,21 +37,24 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 2e-05
+ - learning_rate: 0.0002
  - train_batch_size: 4
  - eval_batch_size: 4
  - seed: 42
+ - gradient_accumulation_steps: 8
+ - total_train_batch_size: 32
  - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: linear
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_ratio: 0.1
  - num_epochs: 3
 
  ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
- |:-------------:|:-----:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
- | No log | 1.0 | 66 | 2146.0 | 0.3030 | 0.2749 | 0.3030 | 0.2411 |
- | No log | 2.0 | 132 | 1056.0 | 0.3788 | 0.3535 | 0.3788 | 0.3451 |
- | No log | 3.0 | 198 | 985.5 | 0.3939 | 0.3631 | 0.3939 | 0.3481 |
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:-----:|:----:|:---------------:|
+ | No log | 1.0 | 10 | 3.2325 |
+ | No log | 2.0 | 20 | 2.4390 |
+ | No log | 3.0 | 30 | 2.4151 |
 
 
  ### Framework versions
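For reference, below is a minimal sketch of how the updated hyperparameters in this commit could map onto a transformers + PEFT training setup. Only the values listed under "Training hyperparameters" (learning rate, batch sizes, gradient accumulation, fused AdamW, cosine schedule with 0.1 warmup ratio, seed, epochs) come from the card; the LoRA rank, alpha, dropout, and output directory are not recorded here and are placeholder assumptions, as is the dataset.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments

# Base model named in the card's tags and description.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

# LoRA settings are NOT recorded in this README; these are placeholders.
lora = LoraConfig(
    r=16,               # assumption: rank not stated in the card
    lora_alpha=32,      # assumption
    lora_dropout=0.05,  # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)

# Values below mirror the "Training hyperparameters" section of the card.
args = TrainingArguments(
    output_dir="llama3_ft_section_classifier",  # placeholder path
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,  # 4 x 8 = total_train_batch_size of 32
    optim="adamw_torch_fused",      # AdamW, betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=3,
    seed=42,
)
# Pass `model` and `args` to a Trainer along with the (unspecified) dataset.
```

Note that the effective batch size of 32 is the per-device batch size of 4 times the 8 gradient-accumulation steps, and with 10 optimizer steps per epoch in the results table this suggests a training set on the order of 320 examples.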