Jeremiah Zhou committed on
Commit 5e560cb · 1 Parent(s): 533c8d2

update model card README.md

Files changed (1)
  1. README.md +84 -0
README.md ADDED
@@ -0,0 +1,84 @@
+ ---
+ license: mit
+ tags:
+ - generated_from_trainer
+ datasets:
+ - glue
+ metrics:
+ - spearmanr
+ model-index:
+ - name: roberta-base-stsb
+   results:
+   - task:
+       name: Text Classification
+       type: text-classification
+     dataset:
+       name: glue
+       type: glue
+       args: stsb
+     metrics:
+     - name: Spearmanr
+       type: spearmanr
+       value: 0.907904999413384
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # roberta-base-stsb
+
+ This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on the STS-B task of the GLUE dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.4155
+ - Pearson: 0.9101
+ - Spearmanr: 0.9079
+ - Combined Score: 0.9090
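
The card does not say how the three correlation figures relate, so the following is a minimal sketch rather than the Trainer's own evaluation code. It implements Pearson and Spearman correlation in plain Python and assumes the combined score is the arithmetic mean of the two, which is consistent with the numbers reported above. The `gold` and `predicted` lists are hypothetical toy data, not STS-B examples.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation: covariance over the product of standard deviations."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


def combined_score(pearson_value, spearman_value):
    """Assumption: the combined score is the plain mean of the two correlations."""
    return (pearson_value + spearman_value) / 2


# Hypothetical similarity scores, used only to exercise the functions.
gold = [0.0, 1.0, 2.0, 3.0, 5.0]
predicted = [0.1, 0.5, 1.0, 2.0, 4.9]

toy_pearson = pearson(gold, predicted)
toy_spearman = spearman(gold, predicted)

# Values reported in the card for the evaluation set.
card_combined = combined_score(0.9101, 0.9079)

print(f"toy pearson={toy_pearson:.4f} spearman={toy_spearman:.4f}")
print(f"card combined score={card_combined:.4f}")
```

Because the toy predictions preserve the ordering of the gold scores without being a linear function of them, the Spearman value is higher than the Pearson value here; the two metrics measure rank agreement and linear agreement respectively.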
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 16
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.06
+ - num_epochs: 10.0
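
To make the scheduler settings concrete, here is a rough sketch of the learning rate that a linear schedule with warmup would produce under these hyperparameters. It is not taken from the training run. It assumes the total of 3600 steps shown in the training results table counts optimizer steps, that the warmup length is the warmup ratio times the total steps rounded up, and that the rate rises linearly from zero and then decays linearly back to zero. The names `linear_lr`, `TOTAL_STEPS`, and `WARMUP_STEPS` are illustrative only.

```python
import math

BASE_LR = 2e-05        # learning_rate from the card
WARMUP_RATIO = 0.06    # lr_scheduler_warmup_ratio from the card
TOTAL_STEPS = 3600     # assumption: final step in the results table (10 epochs x 360 steps)

# Assumption: warmup length is the ratio of total steps, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def linear_lr(step):
    """Learning rate at a given optimizer step under linear warmup and linear decay."""
    if step < WARMUP_STEPS:
        # Warmup phase: scale up from 0 to BASE_LR.
        return BASE_LR * step / WARMUP_STEPS
    # Decay phase: scale down from BASE_LR to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


for step in (0, WARMUP_STEPS, 1908, TOTAL_STEPS):
    print(f"step {step:>4}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the peak rate is reached within the first epoch, so almost all of training happens in the decay phase.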
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Pearson | Spearmanr | Combined Score |
+ |:-------------:|:-----:|:----:|:---------------:|:-------:|:---------:|:--------------:|
+ | No log        | 1.0   | 360  | 0.6202          | 0.8787  | 0.8813    | 0.8800         |
+ | 1.6425        | 2.0   | 720  | 0.4864          | 0.9008  | 0.8992    | 0.9000         |
+ | 0.3629        | 3.0   | 1080 | 0.4201          | 0.9043  | 0.9016    | 0.9030         |
+ | 0.3629        | 4.0   | 1440 | 0.4686          | 0.9052  | 0.9003    | 0.9027         |
+ | 0.2212        | 5.0   | 1800 | 0.4622          | 0.9061  | 0.9031    | 0.9046         |
+ | 0.1556        | 6.0   | 2160 | 0.3952          | 0.9086  | 0.9065    | 0.9075         |
+ | 0.1162        | 7.0   | 2520 | 0.4271          | 0.9081  | 0.9070    | 0.9075         |
+ | 0.1162        | 8.0   | 2880 | 0.4169          | 0.9094  | 0.9075    | 0.9085         |
+ | 0.0887        | 9.0   | 3240 | 0.4383          | 0.9091  | 0.9074    | 0.9083         |
+ | 0.0717        | 10.0  | 3600 | 0.4155          | 0.9101  | 0.9079    | 0.9090         |
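
The table can be read in more than one way, so the sketch below pulls out two things it implies: which epoch is best under each criterion, and roughly how many training examples the step counts suggest. The rows are copied from the table above. The dataset-size estimate assumes a single device with no gradient accumulation, which the card does not state.

```python
# (epoch, step, validation_loss, pearson, spearmanr, combined_score), copied from the table
ROWS = [
    (1, 360, 0.6202, 0.8787, 0.8813, 0.8800),
    (2, 720, 0.4864, 0.9008, 0.8992, 0.9000),
    (3, 1080, 0.4201, 0.9043, 0.9016, 0.9030),
    (4, 1440, 0.4686, 0.9052, 0.9003, 0.9027),
    (5, 1800, 0.4622, 0.9061, 0.9031, 0.9046),
    (6, 2160, 0.3952, 0.9086, 0.9065, 0.9075),
    (7, 2520, 0.4271, 0.9081, 0.9070, 0.9075),
    (8, 2880, 0.4169, 0.9094, 0.9075, 0.9085),
    (9, 3240, 0.4383, 0.9091, 0.9074, 0.9083),
    (10, 3600, 0.4155, 0.9101, 0.9079, 0.9090),
]
TRAIN_BATCH_SIZE = 16  # train_batch_size from the card

# The two selection criteria do not agree on the best epoch.
best_by_loss = min(ROWS, key=lambda row: row[2])
best_by_score = max(ROWS, key=lambda row: row[5])

# Steps per epoch, taken from the first row.
steps_per_epoch = ROWS[0][1] // ROWS[0][0]

# Assumption: one device, no gradient accumulation, last partial batch kept.
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE

print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[2]})")
print(f"highest combined score: epoch {best_by_score[0]} ({best_by_score[5]})")
print(f"implied training set size: {min_examples} to {max_examples} examples")
```

Validation loss is lowest at epoch 6 while the correlation metrics keep improving slightly through epoch 10, which is the checkpoint whose results the card reports. The implied size range would be consistent with the 5,749 training pairs in GLUE STS-B.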
+
+
+ ### Framework versions
+
+ - Transformers 4.20.0.dev0
+ - PyTorch 1.11.0+cu113
+ - Datasets 2.1.0
+ - Tokenizers 0.12.1