End of training

Files changed:
- README.md (+6, -8)
- adapter_model.bin (+1, -1)
README.md CHANGED

@@ -71,11 +71,11 @@ lr_scheduler: linear
 max_grad_norm: 1.0
 max_memory:
   0: 75GB
-max_steps:
+max_steps: 200
 micro_batch_size: 4
 mlflow_experiment_name: /tmp/00f5ac3cc66d870f_train_data.json
 model_type: AutoModelForCausalLM
-num_epochs:
+num_epochs: 100
 optim_args:
   adam_beta1: 0.9
   adam_beta2: 0.95
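The two keys filled in by this hunk interact: `max_steps: 200` caps the run long before `num_epochs: 100` could complete. A minimal sketch of the arithmetic, assuming the steps-per-epoch implied by the results table further down (step 119 at epoch 0.9979); that rate is an inference, not a value stated in the config:

```python
# Minimal sketch: why training stops at epoch ~1.68 despite num_epochs: 100.
# steps_per_epoch is inferred from the results table (step 119 at epoch 0.9979);
# it is an estimate, not a configured value.
steps_per_epoch = 119 / 0.9979        # ~119.25 optimizer steps per epoch
max_steps = 200                       # from the updated config
print(max_steps / steps_per_epoch)    # ~1.677, matching the final table row's epoch
```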
@@ -112,7 +112,7 @@ xformers_attention: null
 
 This model is a fine-tuned version of [fxmarty/tiny-dummy-qwen2](https://huggingface.co/fxmarty/tiny-dummy-qwen2) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 11.
+- Loss: 11.9184
 
 ## Model description
 
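The card describes a PEFT-style adapter over the base model (the other file in this commit, adapter_model.bin, is the adapter checkpoint). A minimal loading sketch, not the author's documented usage, with `some-user/some-adapter` as a hypothetical repo id standing in for wherever this adapter is published:

```python
# Minimal sketch: attach the trained adapter to the base model with peft.
# "some-user/some-adapter" is a hypothetical repo id; substitute the repo
# that actually contains adapter_model.bin.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("fxmarty/tiny-dummy-qwen2")
tokenizer = AutoTokenizer.from_pretrained("fxmarty/tiny-dummy-qwen2")
model = PeftModel.from_pretrained(base, "some-user/some-adapter")
model.eval()
```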
@@ -140,16 +140,14 @@ The following hyperparameters were used during training:
 - optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=adam_beta1=0.9,adam_beta2=0.95,adam_epsilon=1e-5
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 5
-- training_steps:
+- training_steps: 200
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 11.
-| 11.
-| 11.916 | 2.9937 | 357 | 11.9166 |
-| 11.916 | 3.0021 | 358 | 11.9163 |
+| 11.9214 | 0.9979 | 119 | 11.9196 |
+| 11.9219 | 1.6771 | 200 | 11.9184 |
 
 
 ### Framework versions
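`OptimizerNames.ADAMW_BNB` is transformers' name for the 8-bit bitsandbytes AdamW, and the `optimizer_args` string overrides its betas and epsilon. A minimal sketch constructing the equivalent optimizer and linear-warmup schedule directly; the toy model and learning rate are placeholders, not values from this run:

```python
# Minimal sketch: the bitsandbytes 8-bit AdamW behind OptimizerNames.ADAMW_BNB,
# with the beta/epsilon overrides from optimizer_args, plus a linear schedule
# with 5 warmup steps over 200 training steps. Model and lr are placeholders.
import torch
import bitsandbytes as bnb
from transformers import get_linear_schedule_with_warmup

model = torch.nn.Linear(8, 8)           # stand-in for the fine-tuned LM
optimizer = bnb.optim.AdamW8bit(
    model.parameters(),
    lr=2e-4,                            # placeholder learning rate
    betas=(0.9, 0.95),                  # adam_beta1 / adam_beta2
    eps=1e-5,                           # adam_epsilon
)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=5, num_training_steps=200
)
```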
adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:a51735f763e2d089d3ea67793f9c7868b8c558d2768dfb12bf518086b4d2a481
 size 55170
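adapter_model.bin is stored via Git LFS, so the diff above is over the pointer file: only the object's sha256 and byte size change in the repo, while the blob itself lives in LFS storage. A minimal verification sketch using only the standard library, assuming the real file has been downloaded locally (the path is a placeholder):

```python
# Minimal sketch: check a downloaded adapter_model.bin against the LFS pointer
# above (expected sha256 and size). The local path is a placeholder.
import hashlib
from pathlib import Path

EXPECTED_SHA256 = "a51735f763e2d089d3ea67793f9c7868b8c558d2768dfb12bf518086b4d2a481"
EXPECTED_SIZE = 55170

data = Path("adapter_model.bin").read_bytes()
assert len(data) == EXPECTED_SIZE, f"size mismatch: {len(data)} bytes"
assert hashlib.sha256(data).hexdigest() == EXPECTED_SHA256, "sha256 mismatch"
print("LFS pointer verified")
```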