farid1088 committed on
Commit
027ddb3
verified
1 Parent(s): 42fde1c

Model save

README.md CHANGED
@@ -12,8 +12,6 @@ should probably proofread and complete it, then remove this comment. -->
  # qa_bert_training
 
  This model was trained from scratch on the None dataset.
- It achieves the following results on the evaluation set:
- - Loss: 4.5796
 
  ## Model description
 
@@ -33,67 +31,18 @@ More information needed
 
  The following hyperparameters were used during training:
  - learning_rate: 2e-05
- - train_batch_size: 64
- - eval_batch_size: 4
+ - train_batch_size: 16
+ - eval_batch_size: 1
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - num_epochs: 50
+ - num_epochs: 1
 
  ### Training results
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | No log | 1.0 | 5 | 4.4633 |
- | No log | 2.0 | 10 | 4.1459 |
- | No log | 3.0 | 15 | 3.8142 |
- | No log | 4.0 | 20 | 3.5498 |
- | No log | 5.0 | 25 | 3.4453 |
- | No log | 6.0 | 30 | 3.2400 |
- | No log | 7.0 | 35 | 3.2037 |
- | No log | 8.0 | 40 | 3.1653 |
- | No log | 9.0 | 45 | 3.0743 |
- | No log | 10.0 | 50 | 3.0701 |
- | No log | 11.0 | 55 | 3.1443 |
- | No log | 12.0 | 60 | 3.1234 |
- | No log | 13.0 | 65 | 3.1602 |
- | No log | 14.0 | 70 | 3.2970 |
- | No log | 15.0 | 75 | 3.5709 |
- | No log | 16.0 | 80 | 3.3519 |
- | No log | 17.0 | 85 | 3.4430 |
- | No log | 18.0 | 90 | 3.5896 |
- | No log | 19.0 | 95 | 3.5401 |
- | No log | 20.0 | 100 | 3.8490 |
- | No log | 21.0 | 105 | 3.8165 |
- | No log | 22.0 | 110 | 3.8282 |
- | No log | 23.0 | 115 | 4.0689 |
- | No log | 24.0 | 120 | 3.9270 |
- | No log | 25.0 | 125 | 4.0555 |
- | No log | 26.0 | 130 | 4.0671 |
- | No log | 27.0 | 135 | 4.1758 |
- | No log | 28.0 | 140 | 4.2107 |
- | No log | 29.0 | 145 | 4.0869 |
- | No log | 30.0 | 150 | 4.3414 |
- | No log | 31.0 | 155 | 4.2039 |
- | No log | 32.0 | 160 | 4.3510 |
- | No log | 33.0 | 165 | 4.3739 |
- | No log | 34.0 | 170 | 4.4402 |
- | No log | 35.0 | 175 | 4.3934 |
- | No log | 36.0 | 180 | 4.5000 |
- | No log | 37.0 | 185 | 4.3904 |
- | No log | 38.0 | 190 | 4.4441 |
- | No log | 39.0 | 195 | 4.5329 |
- | No log | 40.0 | 200 | 4.5319 |
- | No log | 41.0 | 205 | 4.4463 |
- | No log | 42.0 | 210 | 4.5857 |
- | No log | 43.0 | 215 | 4.3889 |
- | No log | 44.0 | 220 | 4.5061 |
- | No log | 45.0 | 225 | 4.5397 |
- | No log | 46.0 | 230 | 4.5463 |
- | No log | 47.0 | 235 | 4.5477 |
- | No log | 48.0 | 240 | 4.6320 |
- | No log | 49.0 | 245 | 4.6022 |
- | No log | 50.0 | 250 | 4.5796 |
+ | No log | 1.0 | 20 | 4.1789 |
 
 
  ### Framework versions
emissions.csv CHANGED
@@ -1,2 +1,2 @@
  timestamp,project_name,run_id,duration,emissions,emissions_rate,cpu_power,gpu_power,ram_power,cpu_energy,gpu_energy,ram_energy,energy_consumed,country_name,country_iso_code,region,cloud_provider,cloud_region,os,python_version,codecarbon_version,cpu_count,cpu_model,gpu_count,gpu_model,longitude,latitude,ram_total_size,tracking_mode,on_cloud,pue
- 2024-02-23T23:09:47,codecarbon,a901e253-6179-463e-83bd-92f8791a4bb3,155.24780178070068,0.020808917586413248,0.00013403679374351094,112.5,496.249,377.89256858825684,0.004851267136633397,0.037720484018892046,0.016157910662601423,0.05872966181812684,Germany,DEU,free and hanseatic city of hamburg,,,Linux-5.15.0-76-generic-x86_64-with-glibc2.35,3.11.4,2.2.3,128,AMD EPYC 7543 32-Core Processor,4,4 x NVIDIA A40,9.9683,53.5649,1007.7135162353516,machine,N,1.0
+ 2024-02-28T17:39:10,codecarbon,a1bacad7-600d-43d1-9529-499f65459825,66.95561790466309,0.0032314086029868005,4.8261948797006786e-05,112.5,0.0,377.8920850753784,0.0020923272147774694,0,0.00702777880042022,0.009120106015197691,Germany,DEU,free and hanseatic city of hamburg,,,Linux-5.15.0-91-generic-x86_64-with-glibc2.35,3.11.4,2.2.3,128,AMD EPYC 7543 32-Core Processor,,,9.9683,53.5649,1007.7122268676758,machine,N,1.0
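Reviewer note: the two `emissions.csv` rows appear internally consistent. The sketch below is a rough sanity check, not part of the commit. It assumes that `emissions_rate` is `emissions / duration` and that `energy_consumed` is the sum of the CPU, GPU, and RAM energy columns, which is how these columns seem to relate in both rows. The helper name `check_row` and the trimmed column set are illustrative only.

```python
import csv
import io
import math

# Old and new data rows from emissions.csv, trimmed to the columns checked here.
CSV_TEXT = """timestamp,duration,emissions,emissions_rate,cpu_energy,gpu_energy,ram_energy,energy_consumed
2024-02-23T23:09:47,155.24780178070068,0.020808917586413248,0.00013403679374351094,0.004851267136633397,0.037720484018892046,0.016157910662601423,0.05872966181812684
2024-02-28T17:39:10,66.95561790466309,0.0032314086029868005,4.8261948797006786e-05,0.0020923272147774694,0,0.00702777880042022,0.009120106015197691
"""


def check_row(row):
    """Return (rate_ok, energy_ok) for one CSV row (hypothetical helper)."""
    # Assumed relation: emissions_rate == emissions / duration.
    rate = float(row["emissions"]) / float(row["duration"])
    rate_ok = math.isclose(rate, float(row["emissions_rate"]), rel_tol=1e-6)
    # Assumed relation: energy_consumed == cpu_energy + gpu_energy + ram_energy.
    total = sum(float(row[k]) for k in ("cpu_energy", "gpu_energy", "ram_energy"))
    energy_ok = math.isclose(total, float(row["energy_consumed"]), rel_tol=1e-6)
    return rate_ok, energy_ok


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
for row in rows:
    print(row["timestamp"], check_row(row))
```

Note that the new row reports `gpu_energy` of 0 and empty `gpu_count` and `gpu_model` fields, so the second run seems to have been tracked without a visible GPU.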
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b61e7a301d48f5ccf3c683002ed12d051ef3401a1c1cd94a31810ab91b808e30
+ oid sha256:c83627bb69e3d48b546ab6366cbe694a3de1316c2ddf79bc52d1c41b379b4a12
  size 433992496
runs/Feb24_07-33-01_hcdsgpu2/events.out.tfevents.1708759987.hcdsgpu2.3536120.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1d11543fda95afc4ecf334d8484a44378038b15c6eb21777ed135fd843dc5ba7
+ size 88
runs/Feb24_07-37-15_hcdsgpu2/events.out.tfevents.1708760242.hcdsgpu2.3537879.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:89d5dbec87247876b1fc6c04814376e67afbb05204af9d4fbb72969b49d0be8d
+ size 4406
runs/Feb24_07-40-52_hcdsgpu2/events.out.tfevents.1708760460.hcdsgpu2.3539549.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9b365ec73bdf85ad075187309f98fd1f0a7d6d247c7f9213fd7a3ed0c4c81123
+ size 88
runs/Feb24_07-42-34_hcdsgpu2/events.out.tfevents.1708760561.hcdsgpu2.3540534.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fe5ecea56153e2ec10b5952bb4cfd800fe020ab0a44ce51d962598fec5cfc2d7
+ size 4406
runs/Feb24_07-48-02_hcdsgpu2/events.out.tfevents.1708760889.hcdsgpu2.3542653.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ca1f93b6fbafd35420b453cea51b1a4e3fa03ac999cec49699f17638f98c8e42
+ size 4406
runs/Feb24_08-20-42_hcdsgpu2/events.out.tfevents.1708762848.hcdsgpu2.3558024.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bafd21665e712f12faaeff7eb8866332f675861fd9ca39a606d59d66d5cb797f
+ size 88
runs/Feb28_17-37-58_hcdsgpu1/events.out.tfevents.1709141883.hcdsgpu1.836927.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9ac24489d8a7e0c8d80b4f9b0570d28a219cc3039298d455c0333eeb20c96ba6
+ size 5020
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:f23f3fcc636eaf325079428465594aa379803a8bc912a0149d613cc2c1f6e1dd
+ oid sha256:104d6f7d9c1d0cd06e380ab16159fe6ca75b9223b25666c98f10385c5add4c02
  size 4728