Model save
README.md CHANGED

@@ -1,8 +1,9 @@
 ---
 license: gemma
-base_model: google/paligemma-3b-pt-224
+library_name: peft
 tags:
 - generated_from_trainer
+base_model: google/paligemma-3b-pt-224
 datasets:
 - vq_av2
 model-index:
@@ -13,12 +14,11 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/statking/huggingface/runs/xgb0dent)
 # paligemma-vqa
 
 This model is a fine-tuned version of [google/paligemma-3b-pt-224](https://huggingface.co/google/paligemma-3b-pt-224) on the vq_av2 dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
+- Loss: 0.5071
 
 ## Model description
 
@@ -37,39 +37,33 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.
+- learning_rate: 0.0001
 - train_batch_size: 16
 - eval_batch_size: 16
 - seed: 42
+- distributed_type: multi-GPU
+- num_devices: 4
 - gradient_accumulation_steps: 4
-- total_train_batch_size:
+- total_train_batch_size: 256
+- total_eval_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps:
-- num_epochs:
+- lr_scheduler_warmup_steps: 50
+- num_epochs: 2
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 0.
-| 0.
-| 0.
-| 0.0001 | 0.2943 | 2000 | 0.0001 |
-| 0.0001 | 0.3679 | 2500 | 0.0001 |
-| 0.0001 | 0.4415 | 3000 | 0.0001 |
-| 0.0002 | 0.5151 | 3500 | 0.0002 |
-| 0.0001 | 0.5886 | 4000 | 0.0001 |
-| 0.0001 | 0.6622 | 4500 | 0.0001 |
-| 0.0001 | 0.7358 | 5000 | 0.0001 |
-| 0.0001 | 0.8094 | 5500 | 0.0001 |
-| 0.0001 | 0.8830 | 6000 | 0.0001 |
-| 0.0001 | 0.9566 | 6500 | 0.0001 |
+| 0.5618 | 0.5886 | 1000 | 0.5531 |
+| 0.5268 | 1.1772 | 2000 | 0.5335 |
+| 0.5099 | 1.7657 | 3000 | 0.5071 |
 
 
 ### Framework versions
 
--
+- PEFT 0.11.1
+- Transformers 4.41.1
 - Pytorch 2.2.0+cu121
 - Datasets 2.19.1
-- Tokenizers 0.19.1
+- Tokenizers 0.19.1
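For context: the updated card's batch-size fields are internally consistent, since 16 examples per device × 4 devices × 4 gradient-accumulation steps = 256, the reported total_train_batch_size. The card also switches the repo to a PEFT adapter (library_name: peft, with a ~45 MB adapter_model.safetensors) rather than full fine-tuned weights. Below is a minimal sketch, not taken from this commit, of how such an adapter is typically loaded on top of the google/paligemma-3b-pt-224 base model with the Transformers 4.41 / PEFT 0.11 APIs the card names. The adapter repo id "statking/paligemma-vqa", the local image path, and the "answer en" VQA prompt prefix are assumptions.

```python
# Sketch: load the PEFT adapter on top of the PaliGemma base model.
# ASSUMPTIONS: the adapter repo id (the card only names the run
# "paligemma-vqa"), the local image file, and the prompt prefix.
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration
from peft import PeftModel

BASE = "google/paligemma-3b-pt-224"   # base_model field from the card
ADAPTER = "statking/paligemma-vqa"    # hypothetical adapter repo id

processor = AutoProcessor.from_pretrained(BASE)
base_model = PaliGemmaForConditionalGeneration.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="auto"
)
# Wrap the base weights with the small adapter_model.safetensors.
model = PeftModel.from_pretrained(base_model, ADAPTER)

image = Image.open("example.jpg").convert("RGB")  # any RGB image
inputs = processor(
    text="answer en What is in the picture?",
    images=image,
    return_tensors="pt",
).to(base_model.device)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=20)

# Drop the prompt tokens and keep only the generated answer.
answer = processor.batch_decode(
    out[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(answer)
```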
adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:36baffd20b87b8b1431cfe8c35045dd83c082f3d404894a0dd11bf78dc0ce038
 size 45258384
runs/May28_07-41-31_ae63705f58eb/events.out.tfevents.1716882099.ae63705f58eb.96253.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:d891a13fe1e0e0b6927a7fcbf3ba26f10c4c09fa53d46a77c0d962c9c485e215
+size 13745
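The two binary entries above are Git LFS pointer files rather than the payloads themselves: "oid sha256:" is the SHA-256 digest of the actual object, and "size" is its length in bytes. A small standard-library sketch of how one might check a downloaded file against its pointer follows; the local path is an assumption.

```python
# Sketch: verify a downloaded Git LFS object against its pointer metadata.
# The pointer's "oid sha256:<hex>" must equal the file's SHA-256 digest,
# and "size" must equal its length in bytes.
import hashlib
import os

def verify_lfs_object(path: str, oid_hex: str, size: int) -> bool:
    if os.path.getsize(path) != size:
        return False
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so large files don't need to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == oid_hex

# Values copied from the adapter_model.safetensors pointer in this commit;
# the local filename is an assumption.
print(verify_lfs_object(
    "adapter_model.safetensors",
    "36baffd20b87b8b1431cfe8c35045dd83c082f3d404894a0dd11bf78dc0ce038",
    45258384,
))
```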