End of training

Files changed (5)

README.md +56 -10
adapter_config.json +2 -2
adapter_model.bin +1 -1
adapter_model.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -65,11 +65,11 @@ lora_model_dir: null
 lora_r: 32
 lora_target_linear: true
 lr_scheduler: cosine
-max_steps: 100
 micro_batch_size: 1
 mlflow_experiment_name: /tmp/82fc59b447b3efcb_train_data.json
 model_type: AutoModelForCausalLM
-num_epochs: 4
 optimizer: adamw_bnb_8bit
 output_dir: miner_id_24
 pad_to_sequence_len: true
@@ -77,7 +77,7 @@ resume_from_checkpoint: null
 s2_attention: null
 sample_packing: false
 saves_per_epoch: 1
-sequence_len: 512
 special_tokens:
   pad_token: </s>
 strict: false
@@ -104,7 +104,7 @@ xformers_attention: null
 This model is a fine-tuned version of [TinyLlama/TinyLlama_v1.1](https://huggingface.co/TinyLlama/TinyLlama_v1.1) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.7032
 ## Model description
@@ -132,17 +132,63 @@ The following hyperparameters were used during training:
 - optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 10
-- training_steps: 100
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 1.5424        | 0.0010 | 1    | 1.4366          |
-| 0.8921        | 0.0238 | 25   | 0.9202          |
-| 0.5802        | 0.0475 | 50   | 0.7497          |
-| 0.723         | 0.0713 | 75   | 0.7097          |
-| 0.6108        | 0.0950 | 100  | 0.7032          |
 ### Framework versions

 lora_r: 32
 lora_target_linear: true
 lr_scheduler: cosine
+max_steps: 1000
 micro_batch_size: 1
 mlflow_experiment_name: /tmp/82fc59b447b3efcb_train_data.json
 model_type: AutoModelForCausalLM
+num_epochs: 50
 optimizer: adamw_bnb_8bit
 output_dir: miner_id_24
 pad_to_sequence_len: true
 s2_attention: null
 sample_packing: false
 saves_per_epoch: 1
+sequence_len: 1024
 special_tokens:
   pad_token: </s>
 strict: false
 This model is a fine-tuned version of [TinyLlama/TinyLlama_v1.1](https://huggingface.co/TinyLlama/TinyLlama_v1.1) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.4983
 ## Model description
 - optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 10
+- training_steps: 1000
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
+| 1.5421        | 0.0010 | 1    | 1.4359          |
+| 1.1304        | 0.0190 | 20   | 1.0104          |
+| 0.6992        | 0.0380 | 40   | 0.7770          |
+| 0.688         | 0.0570 | 60   | 0.7021          |
+| 0.5184        | 0.0760 | 80   | 0.6654          |
+| 0.5509        | 0.0950 | 100  | 0.6418          |
+| 0.6378        | 0.1140 | 120  | 0.6281          |
+| 0.6062        | 0.1330 | 140  | 0.6108          |
+| 0.8244        | 0.1520 | 160  | 0.6009          |
+| 0.5185        | 0.1710 | 180  | 0.5913          |
+| 0.5699        | 0.1900 | 200  | 0.5827          |
+| 0.4622        | 0.2090 | 220  | 0.5753          |
+| 0.4492        | 0.2280 | 240  | 0.5718          |
+| 0.5629        | 0.2470 | 260  | 0.5662          |
+| 0.5465        | 0.2660 | 280  | 0.5632          |
+| 0.376         | 0.2850 | 300  | 0.5571          |
+| 0.4478        | 0.3040 | 320  | 0.5522          |
+| 0.5251        | 0.3230 | 340  | 0.5496          |
+| 0.4852        | 0.3420 | 360  | 0.5444          |
+| 0.5344        | 0.3610 | 380  | 0.5419          |
+| 0.5464        | 0.3800 | 400  | 0.5381          |
+| 0.4565        | 0.3990 | 420  | 0.5354          |
+| 0.4654        | 0.4181 | 440  | 0.5314          |
+| 0.4963        | 0.4371 | 460  | 0.5277          |
+| 0.5259        | 0.4561 | 480  | 0.5268          |
+| 0.5111        | 0.4751 | 500  | 0.5241          |
+| 0.5169        | 0.4941 | 520  | 0.5222          |
+| 0.5947        | 0.5131 | 540  | 0.5183          |
+| 0.5295        | 0.5321 | 560  | 0.5172          |
+| 0.4934        | 0.5511 | 580  | 0.5151          |
+| 0.4575        | 0.5701 | 600  | 0.5135          |
+| 0.4981        | 0.5891 | 620  | 0.5115          |
+| 0.4236        | 0.6081 | 640  | 0.5093          |
+| 0.4831        | 0.6271 | 660  | 0.5095          |
+| 0.3917        | 0.6461 | 680  | 0.5072          |
+| 0.4254        | 0.6651 | 700  | 0.5056          |
+| 0.4732        | 0.6841 | 720  | 0.5043          |
+| 0.4753        | 0.7031 | 740  | 0.5033          |
+| 0.4428        | 0.7221 | 760  | 0.5026          |
+| 0.4353        | 0.7411 | 780  | 0.5011          |
+| 0.4548        | 0.7601 | 800  | 0.5007          |
+| 0.4652        | 0.7791 | 820  | 0.5001          |
+| 0.6047        | 0.7981 | 840  | 0.4996          |
+| 0.5564        | 0.8171 | 860  | 0.4993          |
+| 0.4263        | 0.8361 | 880  | 0.4991          |
+| 0.4986        | 0.8551 | 900  | 0.4989          |
+| 0.4395        | 0.8741 | 920  | 0.4986          |
+| 0.5258        | 0.8931 | 940  | 0.4984          |
+| 0.7145        | 0.9121 | 960  | 0.4982          |
+| 0.5519        | 0.9311 | 980  | 0.4983          |
+| 0.537         | 0.9501 | 1000 | 0.4983          |
 ### Framework versions
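
The epoch and step columns in the new results table suggest that the run stopped before completing a single pass over the data, so `max_steps: 1000` rather than `num_epochs: 50` appears to be the limit that ended training. The sketch below estimates steps per epoch from three rows copied from the table. It assumes the epoch column is simply step divided by steps per epoch, and that `max_steps` takes precedence over `num_epochs`, as it does in the Hugging Face `Trainer` when `max_steps` is positive.

```python
# Rough estimate of optimizer steps per epoch from (epoch, step) pairs.
# The pairs are copied from the new "Training results" table; everything
# else here is illustrative arithmetic, not part of the training config.
rows = [(0.0950, 100), (0.4751, 500), (0.9501, 1000)]


def steps_per_epoch(epoch_fraction: float, step: int) -> float:
    """Assumes epoch_fraction == step / steps_per_epoch."""
    return step / epoch_fraction


estimates = [steps_per_epoch(epoch, step) for epoch, step in rows]
mean_estimate = sum(estimates) / len(estimates)

max_steps = 1000  # from the updated config
epochs_covered = max_steps / mean_estimate

print(f"estimated steps per epoch: {mean_estimate:.0f}")
print(f"epochs covered by max_steps={max_steps}: {epochs_covered:.2f}")
```

Under these assumptions one epoch is a little over 1000 steps, so raising `num_epochs` from 4 to 50 would not by itself change how long this run trains.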

adapter_config.json CHANGED

@@ -20,12 +20,12 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "o_proj",
     "up_proj",
     "v_proj",
-    "k_proj",
     "q_proj",
     "gate_proj",
     "down_proj"
   ],
   "task_type": "CAUSAL_LM",

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
     "up_proj",
     "v_proj",
     "q_proj",
     "gate_proj",
+    "o_proj",
+    "k_proj",
     "down_proj"
   ],
   "task_type": "CAUSAL_LM",
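
The `+2 -2` change to `target_modules` looks like a reordering rather than a change in which layers receive LoRA adapters. A quick check, with both lists copied from the two sides of the diff above (the variable names are only illustrative):

```python
# Module lists copied from the old and new sides of the adapter_config.json diff.
old_target_modules = [
    "o_proj", "up_proj", "v_proj", "k_proj", "q_proj", "gate_proj", "down_proj",
]
new_target_modules = [
    "up_proj", "v_proj", "q_proj", "gate_proj", "o_proj", "k_proj", "down_proj",
]

# Same members means the same layers are targeted, regardless of order.
same_set = set(old_target_modules) == set(new_target_modules)
same_order = old_target_modules == new_target_modules

print(f"same modules targeted: {same_set}")
print(f"same ordering: {same_order}")
```

The differing order is probably a serialization artifact (for example, a list written out from an unordered collection); that explanation is an assumption, since the diff itself does not say why the order changed.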

adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b56c59e92adbb0b5be1dd6a1d97bdbf9f64a5efa53e440f9edf65cb1fc2c01cc
 size 101036698

 version https://git-lfs.github.com/spec/v1
+oid sha256:8dd578988f0ef0d5a53708f0074e8ea807cba197580d3148dd476cabb5bd46af
 size 101036698

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0f2e184ee90d310d0bd93e3af1eb755381119a5f5aa357d235a3886f3d035bdf
 size 100966336

 version https://git-lfs.github.com/spec/v1
+oid sha256:beff57809312474f7135477208c2c4986e685aa6e60620a07ca18a051bca48ac
 size 100966336

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0fa0e1624ac61226a96ff649ba0ceeb6bb69ede0baf51750832963a751952833
 size 6776

 version https://git-lfs.github.com/spec/v1
+oid sha256:0fecb60b53f9bd46f3ac8ad89c8c5a5b6e0a5190870a7fc83e500997378d0cc6
 size 6776
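
For the three binary files the diff shows only Git LFS pointers: the `size` is unchanged and the `oid` (a SHA-256 of the file contents) differs. A minimal sketch of checking a downloaded file against such a pointer follows. The helper names are hypothetical, and the local path in the usage note is a placeholder; only the pointer text is taken from the diff.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large adapters need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pointer(path, pointer_text: str) -> bool:
    """True if the file's size and SHA-256 both match the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algorithm, _, expected = fields["oid"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algorithm}")
    size_ok = Path(path).stat().st_size == int(fields["size"])
    return size_ok and sha256_of_file(path) == expected


# Pointer for the new training_args.bin, copied from the diff above.
NEW_TRAINING_ARGS_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:0fecb60b53f9bd46f3ac8ad89c8c5a5b6e0a5190870a7fc83e500997378d0cc6
size 6776
"""
```

Calling `matches_pointer("training_args.bin", NEW_TRAINING_ARGS_POINTER)` on a local copy (the path is a placeholder) would indicate whether the download corresponds to this commit rather than the previous one.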