SystemAdmin123 committed on
Commit 30eea7c · verified · 1 Parent(s): 89543b3

End of training

Files changed (1)
  1. README.md +5 -12
README.md CHANGED
@@ -37,7 +37,7 @@ datasets:
  system_prompt: ''
  device_map: auto
  eval_sample_packing: false
- eval_steps: 20
+ eval_steps: 200
  flash_attention: true
  gradient_checkpointing: true
  group_by_length: true
@@ -55,7 +55,7 @@ output_dir: /root/.sn56/axolotl/tmp/opt-350m
  pad_to_sequence_len: true
  resize_token_embeddings_to_32x: false
  sample_packing: true
- save_steps: 20
+ save_steps: 200
  save_total_limit: 1
  sequence_len: 2048
  tokenizer_type: GPT2TokenizerFast
@@ -79,8 +79,6 @@ warmup_ratio: 0.05
  # opt-350m

  This model is a fine-tuned version of [facebook/opt-350m](https://huggingface.co/facebook/opt-350m) on the argilla/databricks-dolly-15k-curated-en dataset.
- It achieves the following results on the evaluation set:
- - Loss: 3.0371

  ## Model description

@@ -114,14 +112,9 @@ The following hyperparameters were used during training:

  ### Training results

- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:-------:|:----:|:---------------:|
- | No log | 0.1429 | 1 | 3.0789 |
- | 5.2701 | 2.8571 | 20 | 3.3286 |
- | 4.3713 | 5.7143 | 40 | 3.2812 |
- | 3.885 | 8.5714 | 60 | 3.0509 |
- | 3.7326 | 11.4286 | 80 | 3.0470 |
- | 3.6947 | 14.2857 | 100 | 3.0371 |
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:------:|:----:|:---------------:|
+ | No log | 0.1429 | 1 | 3.0789 |


  ### Framework versions
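The Epoch and Step columns of the removed results table are consistent with 7 optimizer steps per epoch, which may help explain why the new table keeps only the step-1 row. The sketch below checks that arithmetic in Python. It assumes the epoch counter grows linearly with the step count, and `eval_points` is a hypothetical helper, not an Axolotl or Transformers API. It also assumes, without confirmation from this commit, that the new run still stops at 100 steps.

```python
# (step, epoch) pairs copied from the removed "Training results" table.
logged = [
    (1, 0.1429),
    (20, 2.8571),
    (40, 5.7143),
    (60, 8.5714),
    (80, 11.4286),
    (100, 14.2857),
]

# Assumption: the epoch counter advances linearly with the optimizer step,
# so the first row gives the number of steps in one epoch.
steps_per_epoch = round(logged[0][0] / logged[0][1])


def epoch_at(step: int) -> float:
    """Epoch value expected at a given step, rounded like the table."""
    return round(step / steps_per_epoch, 4)


def eval_points(total_steps: int, eval_steps: int) -> list[int]:
    """Hypothetical helper: steps that are multiples of eval_steps."""
    return list(range(eval_steps, total_steps + 1, eval_steps))


if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch)
    print("table consistent:", all(epoch_at(s) == e for s, e in logged))
    print("eval_steps=20 :", eval_points(100, 20))
    print("eval_steps=200:", eval_points(100, 200))
```

Under these assumptions, an interval of 200 falls past the end of a 100-step run, so no periodic evaluation would be logged. The step-1 row in both tables is not modeled here; it appears to be an initial evaluation outside the periodic schedule.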