error577 committed on
Commit 54c1cb1 · verified · 1 parent: 5ae9aae

End of training

Files changed (2):
  1. README.md (+12 −7)
  2. adapter_model.bin (+1 −1)
README.md CHANGED
@@ -70,7 +70,7 @@ max_steps: 1000
 micro_batch_size: 2
 mlflow_experiment_name: /tmp/425c6bf4bb96a710_train_data.json
 model_type: AutoModelForCausalLM
-num_epochs: 1
+num_epochs: 2
 optimizer: adamw_bnb_8bit
 output_dir: miner_id_24
 pad_to_sequence_len: true
@@ -105,7 +105,7 @@ xformers_attention: null
 
 This model is a fine-tuned version of [JackFram/llama-68m](https://huggingface.co/JackFram/llama-68m) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.1382
+- Loss: 1.0955
 
 ## Model description
 
@@ -133,17 +133,22 @@ The following hyperparameters were used during training:
 - optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 10
-- training_steps: 194
+- training_steps: 387
 
 ### Training results
 
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | 5.8379        | 0.0052 | 1    | 5.7354          |
-| 0.9893        | 0.2020 | 39   | 1.3895          |
-| 1.0758        | 0.4040 | 78   | 1.2282          |
-| 0.8513        | 0.6060 | 117  | 1.2148          |
-| 1.8196        | 0.8080 | 156  | 1.1382          |
+| 0.9867        | 0.2020 | 39   | 1.4082          |
+| 1.0126        | 0.4040 | 78   | 1.2563          |
+| 0.8883        | 0.6060 | 117  | 1.1930          |
+| 1.8973        | 0.8080 | 156  | 1.1759          |
+| 2.993         | 1.0100 | 195  | 1.2300          |
+| 0.5959        | 1.2120 | 234  | 1.1373          |
+| 0.7068        | 1.4140 | 273  | 1.1433          |
+| 0.9381        | 1.6161 | 312  | 1.0941          |
+| 0.8364        | 1.8181 | 351  | 1.0955          |
 
 
 ### Framework versions
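The step counts in the README change are internally consistent, which can be checked with a little arithmetic (an informal sketch, not taken from the training code): the results table logs epoch 0.2020 at step 39, implying roughly 193 optimizer steps per epoch, which lines up with `training_steps` going from 194 (`num_epochs: 1`) to 387 (`num_epochs: 2`).

```python
# Rough arithmetic check of the step counts shown in the diff.
# Assumption: steps-per-epoch is inferred from the logged table row
# (epoch 0.2020 at step 39), not read from any training config.
steps_per_epoch = 39 / 0.2020          # ≈ 193 optimizer steps per epoch
one_epoch_steps = 1 * steps_per_epoch  # ≈ 194, old training_steps
two_epoch_steps = 2 * steps_per_epoch  # ≈ 387, new training_steps
print(round(steps_per_epoch), round(two_epoch_steps))
```

The small off-by-one gaps come from the epoch values in the table being rounded to four decimal places.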
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7c1d810e6fa575348b45a2e58a0f90ce76f2a2aa02a8a09d1e48bc87e3c94ef9
+oid sha256:a74cf4a46e9565ad44f480a403506334218e20d174344761478805222485eb0b
 size 1140674
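The `adapter_model.bin` change only swaps the Git LFS pointer: the pointer file stores a SHA-256 of the real weights plus their byte size. A minimal sketch of how such a pointer can be rebuilt locally to verify a downloaded file (`lfs_pointer` is a hypothetical helper name, not part of Git LFS itself):

```python
import hashlib
import os

def lfs_pointer(path):
    """Build a Git LFS pointer (version/oid/size lines) for a local file,
    following the format at https://git-lfs.github.com/spec/v1.
    Comparing its oid line against the pointer in the repo verifies
    that a downloaded file matches the committed weights."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so large weight files are not read at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{h.hexdigest()}\n"
        f"size {os.path.getsize(path)}\n"
    )
```

For this commit, a correctly downloaded `adapter_model.bin` should yield the new oid `a74cf4a4…` and size `1140674`.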