DeepDream2045/4bcc03c8-df85-4a35-aec2-a35701f2914d

Generated from Trainer

DeepDream2045 committed on Dec 16, 2024

Commit 572de99 · verified · 1 Parent(s): 50a392c

End of training

Files changed (2)

README.md +6 -6
adapter_model.bin +1 -1

README.md CHANGED

@@ -74,7 +74,7 @@ optimizer: adamw_torch
 output_dir: miner_id_24
 pad_to_sequence_len: true
 resume_from_checkpoint: null
-s2_attention: false
 sample_packing: false
 save_steps: 25
 sequence_len: 2048
@@ -85,14 +85,14 @@ train_on_inputs: false
 trust_remote_code: true
 val_set_size: 0.05
 wandb_entity: null
-wandb_mode: online
 wandb_name: 4bcc03c8-df85-4a35-aec2-a35701f2914d
 wandb_project: Gradients-On-Demand
 wandb_run: your_name
 wandb_runid: 4bcc03c8-df85-4a35-aec2-a35701f2914d
 warmup_ratio: 0.05
 weight_decay: 0.01
-xformers_attention: false
 ```
@@ -102,7 +102,7 @@ xformers_attention: false
 This model is a fine-tuned version of [Xenova/tiny-random-Phi3ForCausalLM](https://huggingface.co/Xenova/tiny-random-Phi3ForCausalLM) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 10.3641
 ## Model description
@@ -137,8 +137,8 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | 10.3798       | 0.0007 | 1    | 10.3799         |
-| 10.3664       | 0.0164 | 25   | 10.3686         |
-| 10.3805       | 0.0328 | 50   | 10.3641         |
 ### Framework versions

 output_dir: miner_id_24
 pad_to_sequence_len: true
 resume_from_checkpoint: null
+s2_attention: null
 sample_packing: false
 save_steps: 25
 sequence_len: 2048
 trust_remote_code: true
 val_set_size: 0.05
 wandb_entity: null
+wandb_mode: disabled
 wandb_name: 4bcc03c8-df85-4a35-aec2-a35701f2914d
 wandb_project: Gradients-On-Demand
 wandb_run: your_name
 wandb_runid: 4bcc03c8-df85-4a35-aec2-a35701f2914d
 warmup_ratio: 0.05
 weight_decay: 0.01
+xformers_attention: true
 ```
 This model is a fine-tuned version of [Xenova/tiny-random-Phi3ForCausalLM](https://huggingface.co/Xenova/tiny-random-Phi3ForCausalLM) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 10.3652
 ## Model description
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | 10.3798       | 0.0007 | 1    | 10.3799         |
+| 10.3674       | 0.0164 | 25   | 10.3696         |
+| 10.379        | 0.0328 | 50   | 10.3652         |
 ### Framework versions

adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:01fe1ee7b38326f2ab85001f6a79848d60870e11682e4a021beb930f0afda060
 size 120926

 version https://git-lfs.github.com/spec/v1
+oid sha256:4115140bc4c389927425c6be28ec28faaa8660b6e3f0b87bf5b74de8b7566188
 size 120926
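
The `adapter_model.bin` change above touches only the Git LFS pointer: the `oid` differs while `size` stays at 120926 bytes. As a minimal sketch of how the two pointers could be compared, the Python below parses the three `key value` lines shown in the diff. The helper name `parse_lfs_pointer` and the `OLD`/`NEW` constants are hypothetical, introduced only for illustration; they are not part of this repository or of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is a key, a single space, then the value.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid is written as '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the two sides of the diff above.
OLD = """version https://git-lfs.github.com/spec/v1
oid sha256:01fe1ee7b38326f2ab85001f6a79848d60870e11682e4a021beb930f0afda060
size 120926
"""
NEW = """version https://git-lfs.github.com/spec/v1
oid sha256:4115140bc4c389927425c6be28ec28faaa8660b6e3f0b87bf5b74de8b7566188
size 120926
"""

old = parse_lfs_pointer(OLD)
new = parse_lfs_pointer(NEW)

same_size = old["size"] == new["size"]
content_changed = old["digest"] != new["digest"]
print(f"same size: {same_size}, content changed: {content_changed}")
```

An unchanged size with a different digest is what one would expect when adapter weights of the same shape are retrained, though the pointer alone does not say which tensors differ.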