fdyrd committed · Commit 8c682f4 (verified) · Parent: e4d4193

Update README: add training and validation info

Files changed (1): README.md (+100 −1)
README.md CHANGED
library_name: transformers
tags:
- text-generation-inference
---

# QwenMath

A text-generation LLM fine-tuned with LoRA to solve math problems.

## Training Statistics

```yaml
training-method: lora
training-time: "5:42"
data-size: 500
epoch: 3
total_flos: "1372250GF"
train_loss: 0.6441
train_samples_per_second: 4.385
train_steps_per_second: 0.544
```
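
A quick sanity check of the throughput figures above (a sketch; the effective batch size is inferred from the logged rates, not reported directly):

```python
# Cross-check the logged throughput: samples/sec divided by steps/sec
# gives the effective batch size, and steps/sec times the 5:42 wall
# clock recovers the approximate total optimizer step count.
samples_per_sec = 4.385
steps_per_sec = 0.544
train_seconds = 5 * 60 + 42  # "5:42" -> 342 s

effective_batch = samples_per_sec / steps_per_sec  # ~8 samples per step
total_steps = steps_per_sec * train_seconds        # ~186 steps
print(round(effective_batch), round(total_steps))  # 8 186
```

This is consistent with data-size × epoch = 1500 samples at roughly 8 samples per step (≈188 steps).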

## Validation Set Performance

Evaluated on the test split of [fdyrd/MATH](https://huggingface.co/datasets/fdyrd/MATH). Each cell below shows accuracy : number of problems; the Average row and column are unweighted means of the accuracies.
<table>
<tr>
<th> Level </th>
<th> Algebra </th>
<th> Intermediate Algebra </th>
<th> Prealgebra </th>
<th> Precalculus </th>
<th> Number Theory </th>
<th> Geometry </th>
<th> Counting & Probability </th>
<th> Average </th>
</tr>
<tr>
<td> Level 1 </td>
<td> 0.541 : 135 </td>
<td> 0.192 : 52 </td>
<td> 0.477 : 86 </td>
<td> 0.228 : 57 </td>
<td> 0.467 : 30 </td>
<td> 0.263 : 38 </td>
<td> 0.359 : 39 </td>
<td> 0.361 </td>
</tr>
<tr>
<td> Level 2 </td>
<td> 0.323 : 201 </td>
<td> 0.109 : 128 </td>
<td> 0.367 : 177 </td>
<td> 0.044 : 113 </td>
<td> 0.380 : 92 </td>
<td> 0.134 : 82 </td>
<td> 0.248 : 101 </td>
<td> 0.229 </td>
</tr>
<tr>
<td> Level 3 </td>
<td> 0.291 : 261 </td>
<td> 0.046 : 195 </td>
<td> 0.308 : 224 </td>
<td> 0.000 : 127 </td>
<td> 0.262 : 122 </td>
<td> 0.088 : 102 </td>
<td> 0.160 : 100 </td>
<td> 0.165 </td>
</tr>
<tr>
<td> Level 4 </td>
<td> 0.180 : 283 </td>
<td> 0.024 : 248 </td>
<td> 0.220 : 191 </td>
<td> 0.009 : 114 </td>
<td> 0.169 : 142 </td>
<td> 0.064 : 125 </td>
<td> 0.090 : 111 </td>
<td> 0.108 </td>
</tr>
<tr>
<td> Level 5 </td>
<td> 0.088 : 307 </td>
<td> 0.004 : 280 </td>
<td> 0.104 : 193 </td>
<td> 0.000 : 135 </td>
<td> 0.136 : 154 </td>
<td> 0.023 : 132 </td>
<td> 0.065 : 123 </td>
<td> 0.060 </td>
</tr>
<tr>
<td> Average </td>
<td> 0.285 </td>
<td> 0.075 </td>
<td> 0.295 </td>
<td> 0.056 </td>
<td> 0.283 </td>
<td> 0.114 </td>
<td> 0.184 </td>
<td> 0.166 </td>
</tr>
</table>
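
A note on how the Average cells are computed (verified here against the Level 1 row and the Algebra column): each is the plain mean of the accuracies, ignoring the per-cell sample counts.

```python
# Reproduce two "Average" cells from the table: both are unweighted
# means of the accuracies (the counts after ":" are sample sizes,
# not weights).
level1 = [0.541, 0.192, 0.477, 0.228, 0.467, 0.263, 0.359]  # Level 1 row
algebra = [0.541, 0.323, 0.291, 0.180, 0.088]               # Algebra column
print(round(sum(level1) / len(level1), 3))    # 0.361 (Level 1 average)
print(round(sum(algebra) / len(algebra), 3))  # 0.285 (Algebra average)
```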