besimray/test

@@ -42,12 +42,12 @@ group_by_length: false
 hub_model_id: besimray/test
 hub_strategy: checkpoint
 hub_token: null
-learning_rate: 0.0002
+learning_rate: 5.0e-05
 load_in_4bit: false
 load_in_8bit: true
 local_rank: null
 logging_steps: 1
-lora_alpha: 32
+lora_alpha: 64
 lora_dropout: 0.05
 lora_fan_in_fan_out: null
 lora_model_dir: null
@@ -61,7 +61,7 @@ model_type: LlamaForCausalLM
 num_epochs: 4
 optimizer: adamw_bnb_8bit
 output_dir: miner_id_besimray
-pad_to_sequence_len: true
+pad_to_sequence_len: false
 resume_from_checkpoint: null
 s2_attention: null
 sample_packing: false
@@ -90,7 +90,7 @@ xformers_attention: null
 This model is a fine-tuned version of [unsloth/Llama-3.2-1B-Instruct](https://huggingface.co/unsloth/Llama-3.2-1B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.2052
+- Loss: 1.2202
 ## Model description
@@ -109,7 +109,7 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.0002
+- learning_rate: 5e-05
 - train_batch_size: 7
 - eval_batch_size: 7
 - seed: 42
@@ -124,16 +124,16 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 1.3357        | 0.0147 | 1    | 1.2696          |
-| 1.1846        | 0.0294 | 2    | 1.2671          |
-| 1.5764        | 0.0441 | 3    | 1.2609          |
-| 1.3116        | 0.0588 | 4    | 1.2478          |
-| 1.3583        | 0.0735 | 5    | 1.2308          |
-| 1.3894        | 0.0882 | 6    | 1.2229          |
-| 1.243         | 0.1029 | 7    | 1.2255          |
-| 1.4176        | 0.1176 | 8    | 1.2249          |
-| 1.3973        | 0.1324 | 9    | 1.2156          |
-| 1.3676        | 0.1471 | 10   | 1.2052          |
+| 1.3327        | 0.0147 | 1    | 1.2694          |
+| 1.1887        | 0.0294 | 2    | 1.2705          |
+| 1.5717        | 0.0441 | 3    | 1.2656          |
+| 1.3113        | 0.0588 | 4    | 1.2619          |
+| 1.3671        | 0.0735 | 5    | 1.2536          |
+| 1.4151        | 0.0882 | 6    | 1.2436          |
+| 1.2607        | 0.1029 | 7    | 1.2301          |
+| 1.4189        | 0.1176 | 8    | 1.2256          |
+| 1.3843        | 0.1324 | 9    | 1.2237          |
+| 1.3753        | 0.1471 | 10   | 1.2202          |
 ### Framework versions

adapter_model.bin

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:04395dced9684a18e5d52dd2a6c1bec536ac3497b627a50216c11ca4ded14e18
+oid sha256:c2295b41ac661bb1f048c5ea31fe90887943731c0624ee1b94ce0b0510bf55c3
 size 67713738
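
To make the effect of this commit easier to read, the two validation-loss columns from the model card diff above can be compared step by step. The following is a minimal sketch, not part of the repository: the numbers are copied from the removed and added table rows, and the helper name `loss_deltas` is hypothetical. It covers only the ten logged steps (up to epoch 0.1471), so it says nothing about the final quality of either run.

```python
# Validation losses copied from the diff, steps 1-10.
# old_val: removed rows (learning_rate 0.0002, lora_alpha 32, pad_to_sequence_len true)
# new_val: added rows   (learning_rate 5e-05,  lora_alpha 64, pad_to_sequence_len false)
old_val = [1.2696, 1.2671, 1.2609, 1.2478, 1.2308,
           1.2229, 1.2255, 1.2249, 1.2156, 1.2052]
new_val = [1.2694, 1.2705, 1.2656, 1.2619, 1.2536,
           1.2436, 1.2301, 1.2256, 1.2237, 1.2202]


def loss_deltas(old, new):
    """Return new - old per step, rounded to the 4 decimals the table reports."""
    return [round(n - o, 4) for o, n in zip(old, new)]


deltas = loss_deltas(old_val, new_val)
final_delta = deltas[-1]  # difference in the reported evaluation loss

for step, delta in enumerate(deltas, start=1):
    print(f"step {step:2d}: {delta:+.4f}")
```

A positive delta means the new configuration has the higher validation loss at that step. The last entry corresponds to the change in the reported evaluation loss, from 1.2052 to 1.2202.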