farid1088 committed on
Commit
c6a6b7a
verified
1 Parent(s): 5e49eb6

Model save

README.md ADDED
@@ -0,0 +1,63 @@
+ ---
+ tags:
+ - generated_from_trainer
+ datasets:
+ - germanquad
+ model-index:
+ - name: GQA_BERT7
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # GQA_BERT7
+
+ This model was trained from scratch on the germanquad dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 4.0355
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 40
+ - eval_batch_size: 40
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 7
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:-----:|:----:|:---------------:|
+ | No log | 1.0 | 288 | 5.4229 |
+ | 4.5272 | 2.0 | 576 | 4.2288 |
+ | 4.5272 | 3.0 | 864 | 3.9695 |
+ | 3.5782 | 4.0 | 1152 | 4.0760 |
+ | 3.5782 | 5.0 | 1440 | 3.7674 |
+ | 3.0376 | 6.0 | 1728 | 3.9112 |
+ | 2.7057 | 7.0 | 2016 | 4.0355 |
+
+
+ ### Framework versions
+
+ - Transformers 4.36.2
+ - Pytorch 2.1.2+cu121
+ - Datasets 2.14.7
+ - Tokenizers 0.15.0
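The hyperparameters and the results table in the README above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not the training code from this commit: it copies the table values by hand, derives the steps per epoch, and evaluates a linear learning-rate decay at each logged step. The helper `linear_lr` and the zero-warmup setting are assumptions, since the README lists `lr_scheduler_type: linear` but no warmup value.

```python
# Values copied from the README's hyperparameter list and results table.
LEARNING_RATE = 2e-05
NUM_EPOCHS = 7

# (epoch, step, validation_loss) rows from the training results table.
RESULTS = [
    (1, 288, 5.4229),
    (2, 576, 4.2288),
    (3, 864, 3.9695),
    (4, 1152, 4.0760),
    (5, 1440, 3.7674),
    (6, 1728, 3.9112),
    (7, 2016, 4.0355),
]

TOTAL_STEPS = RESULTS[-1][1]
STEPS_PER_EPOCH = TOTAL_STEPS // NUM_EPOCHS


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Hypothetical helper: linear warmup, then linear decay to zero.

    ASSUMPTION: warmup_steps=0, because the README lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Row with the lowest validation loss, and the last row of the table.
best_epoch, best_step, best_loss = min(RESULTS, key=lambda row: row[2])
final_epoch, final_step, final_loss = RESULTS[-1]

if __name__ == "__main__":
    print(f"steps per epoch: {STEPS_PER_EPOCH}")
    for epoch, step, val_loss in RESULTS:
        print(f"epoch {epoch}: lr={linear_lr(step):.2e}, val_loss={val_loss}")
    print(f"lowest validation loss: {best_loss} at epoch {best_epoch}")
    print(f"final validation loss: {final_loss} at epoch {final_epoch}")
```

The loss reported at the top of the README (4.0355) is the epoch 7 row, while the lowest validation loss in the table (3.7674) occurs at epoch 5. Training loss keeps falling after that point while validation loss rises, which is worth checking before relying on the final weights.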
emissions.csv ADDED
@@ -0,0 +1,2 @@
+ timestamp,project_name,run_id,duration,emissions,emissions_rate,cpu_power,gpu_power,ram_power,cpu_energy,gpu_energy,ram_energy,energy_consumed,country_name,country_iso_code,region,cloud_provider,cloud_region,os,python_version,codecarbon_version,cpu_count,cpu_model,gpu_count,gpu_model,longitude,latitude,ram_total_size,tracking_mode,on_cloud,pue
+ 2024-03-02T18:49:27,codecarbon,39883f36-6b7b-4636-86da-a887725a4e63,1916.1745176315308,0.22258256135596935,0.00011615985877481055,112.5,693.794,377.8920922279358,0.05987949199974538,0.36775126928149526,0.2005709940846306,0.6282017553658711,Germany,DEU,free and hanseatic city of hamburg,,,Linux-5.15.0-97-generic-x86_64-with-glibc2.35,3.11.4,2.2.3,128,AMD EPYC 7543 32-Core Processor,5,4 x NVIDIA A401 x NVIDIA A100 80GB PCIe,9.9683,53.5649,1007.7122459411621,machine,N,1.0
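The numeric columns of the emissions.csv row above are related to each other, and a short consistency check makes that visible. This sketch embeds only an excerpt of the row (the seven columns it needs, values copied verbatim) and assumes the usual CodeCarbon units: seconds for `duration`, kg CO2eq for `emissions`, and kWh for the energy columns. The variable names are illustrative and not part of the commit.

```python
import csv
import io

# Excerpt of emissions.csv: only the numeric columns needed here,
# with values copied verbatim from the logged row.
EMISSIONS_CSV = (
    "duration,emissions,emissions_rate,cpu_energy,gpu_energy,ram_energy,energy_consumed\n"
    "1916.1745176315308,0.22258256135596935,0.00011615985877481055,"
    "0.05987949199974538,0.36775126928149526,0.2005709940846306,0.6282017553658711\n"
)

row = next(csv.DictReader(io.StringIO(EMISSIONS_CSV)))
run = {name: float(value) for name, value in row.items()}

# Total energy should equal the sum of the CPU, GPU and RAM components.
energy_sum = run["cpu_energy"] + run["gpu_energy"] + run["ram_energy"]

# The emissions rate should equal total emissions divided by the run duration.
derived_rate = run["emissions"] / run["duration"]

# ASSUMPTION: units are kg CO2eq and kWh, so this ratio is kg CO2eq per kWh.
implied_intensity = run["emissions"] / run["energy_consumed"]

duration_minutes = run["duration"] / 60

if __name__ == "__main__":
    print(f"energy sum vs logged: {energy_sum:.6f} vs {run['energy_consumed']:.6f}")
    print(f"rate derived vs logged: {derived_rate:.6e} vs {run['emissions_rate']:.6e}")
    print(f"implied carbon intensity: {implied_intensity:.4f}")
    print(f"duration: {duration_minutes:.1f} minutes")
```

Under those unit assumptions, the run lasted roughly 32 minutes and the logged totals agree with their components.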
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5aa826836c5ed396556f835a5154872f6b7e4dfdd44bf89f9a2730c7e076e96e
+ oid sha256:d1359f8cf5ecb8cf5f2f3fbf54ef165417e1a96c6ab44798f3cd8a38a5e911a9
  size 433992496
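The model.safetensors change above is a diff of a Git LFS pointer file, not of the weights themselves: the pointer holds a spec version, a SHA-256 object id, and the byte size. A minimal sketch of reading such a pointer follows. `parse_lfs_pointer` is a hypothetical helper, and the parameter estimate assumes float32 weights (4 bytes each) and ignores the safetensors header, so it is only a rough upper bound.

```python
# New-side pointer content, copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d1359f8cf5ecb8cf5f2f3fbf54ef165417e1a96c6ab44798f3cd8a38a5e911a9
size 433992496
"""


def parse_lfs_pointer(text):
    """Hypothetical helper: split 'key value' lines of a Git LFS pointer."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
size_mib = pointer["size_bytes"] / 2**20

# ASSUMPTION: float32 weights at 4 bytes per parameter, header overhead ignored.
approx_params_millions = pointer["size_bytes"] / 4 / 1e6

if __name__ == "__main__":
    print(pointer["hash_algorithm"], pointer["digest"][:12])
    print(f"{size_mib:.2f} MiB, roughly {approx_params_millions:.1f}M parameters")
```

Only the `oid` line changes in this diff while `size` stays at 433992496 bytes, which is what one would expect when the stored tensors keep their shapes and only their values change.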
runs/Mar02_18-17-25_hcdsgpu1/events.out.tfevents.1709403451.hcdsgpu1.296199.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:c037e79ec117ba79964107b9625a1ce2b10061df3224afa5844bb92e696e5316
- size 6794
+ oid sha256:aee8eb917ef998b230b1d585333204191c6f8059cca0ba5f34c8a82f560e0070
+ size 7419