Model save

Files changed (11)

README.md +4 -55
emissions.csv +1 -1
model.safetensors +1 -1
runs/Feb24_07-33-01_hcdsgpu2/events.out.tfevents.1708759987.hcdsgpu2.3536120.0 +3 -0
runs/Feb24_07-37-15_hcdsgpu2/events.out.tfevents.1708760242.hcdsgpu2.3537879.0 +3 -0
runs/Feb24_07-40-52_hcdsgpu2/events.out.tfevents.1708760460.hcdsgpu2.3539549.0 +3 -0
runs/Feb24_07-42-34_hcdsgpu2/events.out.tfevents.1708760561.hcdsgpu2.3540534.0 +3 -0
runs/Feb24_07-48-02_hcdsgpu2/events.out.tfevents.1708760889.hcdsgpu2.3542653.0 +3 -0
runs/Feb24_08-20-42_hcdsgpu2/events.out.tfevents.1708762848.hcdsgpu2.3558024.0 +3 -0
runs/Feb28_17-37-58_hcdsgpu1/events.out.tfevents.1709141883.hcdsgpu1.836927.0 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -12,8 +12,6 @@ should probably proofread and complete it, then remove this comment. -->
 # qa_bert_training
 This model was trained from scratch on the None dataset.
-It achieves the following results on the evaluation set:
-- Loss: 4.5796
 ## Model description
@@ -33,67 +31,18 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size: 64
-- eval_batch_size: 4
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 50
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| No log        | 1.0   | 5    | 4.4633          |
-| No log        | 2.0   | 10   | 4.1459          |
-| No log        | 3.0   | 15   | 3.8142          |
-| No log        | 4.0   | 20   | 3.5498          |
-| No log        | 5.0   | 25   | 3.4453          |
-| No log        | 6.0   | 30   | 3.2400          |
-| No log        | 7.0   | 35   | 3.2037          |
-| No log        | 8.0   | 40   | 3.1653          |
-| No log        | 9.0   | 45   | 3.0743          |
-| No log        | 10.0  | 50   | 3.0701          |
-| No log        | 11.0  | 55   | 3.1443          |
-| No log        | 12.0  | 60   | 3.1234          |
-| No log        | 13.0  | 65   | 3.1602          |
-| No log        | 14.0  | 70   | 3.2970          |
-| No log        | 15.0  | 75   | 3.5709          |
-| No log        | 16.0  | 80   | 3.3519          |
-| No log        | 17.0  | 85   | 3.4430          |
-| No log        | 18.0  | 90   | 3.5896          |
-| No log        | 19.0  | 95   | 3.5401          |
-| No log        | 20.0  | 100  | 3.8490          |
-| No log        | 21.0  | 105  | 3.8165          |
-| No log        | 22.0  | 110  | 3.8282          |
-| No log        | 23.0  | 115  | 4.0689          |
-| No log        | 24.0  | 120  | 3.9270          |
-| No log        | 25.0  | 125  | 4.0555          |
-| No log        | 26.0  | 130  | 4.0671          |
-| No log        | 27.0  | 135  | 4.1758          |
-| No log        | 28.0  | 140  | 4.2107          |
-| No log        | 29.0  | 145  | 4.0869          |
-| No log        | 30.0  | 150  | 4.3414          |
-| No log        | 31.0  | 155  | 4.2039          |
-| No log        | 32.0  | 160  | 4.3510          |
-| No log        | 33.0  | 165  | 4.3739          |
-| No log        | 34.0  | 170  | 4.4402          |
-| No log        | 35.0  | 175  | 4.3934          |
-| No log        | 36.0  | 180  | 4.5000          |
-| No log        | 37.0  | 185  | 4.3904          |
-| No log        | 38.0  | 190  | 4.4441          |
-| No log        | 39.0  | 195  | 4.5329          |
-| No log        | 40.0  | 200  | 4.5319          |
-| No log        | 41.0  | 205  | 4.4463          |
-| No log        | 42.0  | 210  | 4.5857          |
-| No log        | 43.0  | 215  | 4.3889          |
-| No log        | 44.0  | 220  | 4.5061          |
-| No log        | 45.0  | 225  | 4.5397          |
-| No log        | 46.0  | 230  | 4.5463          |
-| No log        | 47.0  | 235  | 4.5477          |
-| No log        | 48.0  | 240  | 4.6320          |
-| No log        | 49.0  | 245  | 4.6022          |
-| No log        | 50.0  | 250  | 4.5796          |
 ### Framework versions

 # qa_bert_training
 This model was trained from scratch on the None dataset.
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
+- train_batch_size: 16
+- eval_batch_size: 1
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 1
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| No log        | 1.0   | 20   | 4.1789          |
 ### Framework versions
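
The step counts in the two result tables (5 steps per epoch at `train_batch_size: 64`, 20 steps per epoch at `train_batch_size: 16`) can be cross-checked with a little arithmetic. The sketch below assumes that one epoch takes `ceil(num_examples / train_batch_size)` optimizer steps, which means a single device, no gradient accumulation, and no dropped last batch. None of this is stated in the README, and the helper names are hypothetical.

```python
import math


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Optimizer steps in one epoch, assuming the last partial batch is kept."""
    return math.ceil(num_examples / batch_size)


def compatible_sizes(batch_size: int, steps: int) -> range:
    """All dataset sizes n for which ceil(n / batch_size) == steps."""
    lowest = (steps - 1) * batch_size + 1
    highest = steps * batch_size
    return range(lowest, highest + 1)


old_run = compatible_sizes(batch_size=64, steps=5)   # removed README rows
new_run = compatible_sizes(batch_size=16, steps=20)  # added README row

# Sizes that explain both tables under the assumptions above.
both = range(max(old_run.start, new_run.start), min(old_run.stop, new_run.stop))
print(both.start, both.stop - 1)  # prints "305 320"
```

Under these assumptions the two runs are consistent with the same training set of roughly 305 to 320 examples. If the old run actually used several devices or gradient accumulation, this estimate would not hold.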

emissions.csv CHANGED

@@ -1,2 +1,2 @@
 timestamp,project_name,run_id,duration,emissions,emissions_rate,cpu_power,gpu_power,ram_power,cpu_energy,gpu_energy,ram_energy,energy_consumed,country_name,country_iso_code,region,cloud_provider,cloud_region,os,python_version,codecarbon_version,cpu_count,cpu_model,gpu_count,gpu_model,longitude,latitude,ram_total_size,tracking_mode,on_cloud,pue
-2024-02-23T23:09:47,codecarbon,a901e253-6179-463e-83bd-92f8791a4bb3,155.24780178070068,0.020808917586413248,0.00013403679374351094,112.5,496.249,377.89256858825684,0.004851267136633397,0.037720484018892046,0.016157910662601423,0.05872966181812684,Germany,DEU,free and hanseatic city of hamburg,,,Linux-5.15.0-76-generic-x86_64-with-glibc2.35,3.11.4,2.2.3,128,AMD EPYC 7543 32-Core Processor,4,4 x NVIDIA A40,9.9683,53.5649,1007.7135162353516,machine,N,1.0


 timestamp,project_name,run_id,duration,emissions,emissions_rate,cpu_power,gpu_power,ram_power,cpu_energy,gpu_energy,ram_energy,energy_consumed,country_name,country_iso_code,region,cloud_provider,cloud_region,os,python_version,codecarbon_version,cpu_count,cpu_model,gpu_count,gpu_model,longitude,latitude,ram_total_size,tracking_mode,on_cloud,pue
+2024-02-28T17:39:10,codecarbon,a1bacad7-600d-43d1-9529-499f65459825,66.95561790466309,0.0032314086029868005,4.8261948797006786e-05,112.5,0.0,377.8920850753784,0.0020923272147774694,0,0.00702777880042022,0.009120106015197691,Germany,DEU,free and hanseatic city of hamburg,,,Linux-5.15.0-91-generic-x86_64-with-glibc2.35,3.11.4,2.2.3,128,AMD EPYC 7543 32-Core Processor,,,9.9683,53.5649,1007.7122268676758,machine,N,1.0
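
The columns of the two `emissions.csv` rows are related to each other, and a quick check makes those relations explicit. The sketch below copies the relevant values from the removed and added rows and tests two identities that the numbers appear to satisfy: `energy_consumed` is the sum of the three per-component energies, and `emissions_rate` is `emissions / duration`. The units are assumed to follow codecarbon's usual conventions (kWh for energy, kg CO2-equivalent for emissions, seconds for duration), and the function name is hypothetical.

```python
import math

# Values copied from the removed (-) and added (+) emissions.csv rows.
ROWS = {
    "removed": {
        "duration": 155.24780178070068,
        "emissions": 0.020808917586413248,
        "emissions_rate": 0.00013403679374351094,
        "cpu_energy": 0.004851267136633397,
        "gpu_energy": 0.037720484018892046,
        "ram_energy": 0.016157910662601423,
        "energy_consumed": 0.05872966181812684,
    },
    "added": {
        "duration": 66.95561790466309,
        "emissions": 0.0032314086029868005,
        "emissions_rate": 4.8261948797006786e-05,
        "cpu_energy": 0.0020923272147774694,
        "gpu_energy": 0.0,
        "ram_energy": 0.00702777880042022,
        "energy_consumed": 0.009120106015197691,
    },
}


def check_row(row: dict) -> dict:
    """Recompute derived columns and compare them with the logged values."""
    total_energy = row["cpu_energy"] + row["gpu_energy"] + row["ram_energy"]
    rate = row["emissions"] / row["duration"]
    return {
        "energy_sum_ok": math.isclose(total_energy, row["energy_consumed"], rel_tol=1e-6),
        "rate_ok": math.isclose(rate, row["emissions_rate"], rel_tol=1e-6),
        # Implied carbon intensity of the electricity (emissions per unit energy).
        "intensity": row["emissions"] / row["energy_consumed"],
    }


for label, row in ROWS.items():
    print(label, check_row(row))
```

In the added row `gpu_power` and `gpu_energy` are zero and `gpu_count` is empty, so all of the energy logged for that run comes from the CPU and RAM columns.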

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b61e7a301d48f5ccf3c683002ed12d051ef3401a1c1cd94a31810ab91b808e30
 size 433992496

 version https://git-lfs.github.com/spec/v1
+oid sha256:c83627bb69e3d48b546ab6366cbe694a3de1316c2ddf79bc52d1c41b379b4a12
 size 433992496
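
The diff for `model.safetensors` shows a Git LFS pointer file, not the weights themselves: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. Only the `oid` changed here, while the size stayed at 433992496 bytes. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its version, hash, and size fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


# The added side of the model.safetensors pointer from this commit.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c83627bb69e3d48b546ab6366cbe694a3de1316c2ddf79bc52d1c41b379b4a12
size 433992496
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["hash_algo"], pointer["size"])  # prints "sha256 433992496"
```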

runs/Feb24_07-33-01_hcdsgpu2/events.out.tfevents.1708759987.hcdsgpu2.3536120.0 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1d11543fda95afc4ecf334d8484a44378038b15c6eb21777ed135fd843dc5ba7
+size 88

runs/Feb24_07-37-15_hcdsgpu2/events.out.tfevents.1708760242.hcdsgpu2.3537879.0 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:89d5dbec87247876b1fc6c04814376e67afbb05204af9d4fbb72969b49d0be8d
+size 4406

runs/Feb24_07-40-52_hcdsgpu2/events.out.tfevents.1708760460.hcdsgpu2.3539549.0 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9b365ec73bdf85ad075187309f98fd1f0a7d6d247c7f9213fd7a3ed0c4c81123
+size 88

runs/Feb24_07-42-34_hcdsgpu2/events.out.tfevents.1708760561.hcdsgpu2.3540534.0 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:fe5ecea56153e2ec10b5952bb4cfd800fe020ab0a44ce51d962598fec5cfc2d7
+size 4406

runs/Feb24_07-48-02_hcdsgpu2/events.out.tfevents.1708760889.hcdsgpu2.3542653.0 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ca1f93b6fbafd35420b453cea51b1a4e3fa03ac999cec49699f17638f98c8e42
+size 4406

runs/Feb24_08-20-42_hcdsgpu2/events.out.tfevents.1708762848.hcdsgpu2.3558024.0 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bafd21665e712f12faaeff7eb8866332f675861fd9ca39a606d59d66d5cb797f
+size 88

runs/Feb28_17-37-58_hcdsgpu1/events.out.tfevents.1709141883.hcdsgpu1.836927.0 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9ac24489d8a7e0c8d80b4f9b0570d28a219cc3039298d455c0333eeb20c96ba6
+size 5020
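
The event file names added under `runs/` carry some metadata of their own. Assuming the TensorBoard-style layout `events.out.tfevents.<unix time>.<hostname>.<pid>.<counter>` (an assumption based on the names in this commit, not something the diff states), the sketch below splits a name into those parts and converts the timestamp to UTC. `parse_event_filename` is a hypothetical helper.

```python
from datetime import datetime, timezone

PREFIX = "events.out.tfevents."


def parse_event_filename(name: str) -> dict:
    """Split an event file name into creation time, host, PID, and counter."""
    if not name.startswith(PREFIX):
        raise ValueError(f"not an event file name: {name!r}")
    timestamp, rest = name[len(PREFIX):].split(".", 1)
    # Split from the right so a hostname containing dots stays intact.
    host, pid, counter = rest.rsplit(".", 2)
    return {
        "created_utc": datetime.fromtimestamp(int(timestamp), tz=timezone.utc),
        "host": host,
        "pid": int(pid),
        "counter": int(counter),
    }


info = parse_event_filename("events.out.tfevents.1709141883.hcdsgpu1.836927.0")
print(info["created_utc"].isoformat(), info["host"], info["pid"])
# prints "2024-02-28T17:38:03+00:00 hcdsgpu1 836927"
```

The decoded time is a few seconds after the `Feb28_17-37-58` run directory name, which suggests (but does not prove) that the training host's clock is set to UTC.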

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f23f3fcc636eaf325079428465594aa379803a8bc912a0149d613cc2c1f6e1dd
 size 4728

 version https://git-lfs.github.com/spec/v1
+oid sha256:104d6f7d9c1d0cd06e380ab16159fe6ca75b9223b25666c98f10385c5add4c02
 size 4728