---
language:
- en
license: mit
datasets:
- fdyrd/MATH
base_model:
- Qwen/Qwen2.5-0.5B
library_name: transformers
tags:
- text-generation-inference
metrics:
- accuracy
---

# QwenMath

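A text-generation LLM for solving math problems, fine-tuned from [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) with LoRA on [fdyrd/MATH](https://huggingface.co/datasets/fdyrd/MATH).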

## Training Statistics
```yaml
training-method: lora
training-time: "5:42"
data-size: 500
epoch: 3
total_flos: "1372250GF"
train_loss: 0.6441
train_samples_per_second: 4.385
train_steps_per_second: 0.544
```
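
The figures above can be cross-checked against each other. The sketch below assumes that `training-time` is in `minutes:seconds` and that the two throughput values are averages over the whole run; neither is stated in the card. Under those assumptions, the throughput implies about 1,500 samples seen (500 samples × 3 epochs) and about 8 samples per optimizer step.

```python
# Values copied from the training statistics above.
training_time = "5:42"        # ASSUMPTION: minutes:seconds
data_size = 500
epochs = 3
samples_per_second = 4.385
steps_per_second = 0.544

minutes, seconds = (int(part) for part in training_time.split(":"))
elapsed_seconds = minutes * 60 + seconds

# Totals implied by the reported throughput.
samples_seen = samples_per_second * elapsed_seconds
optimizer_steps = steps_per_second * elapsed_seconds

# Implied effective batch size (samples per optimizer step).
samples_per_step = samples_seen / optimizer_steps

print(f"elapsed: {elapsed_seconds} s")
print(f"samples seen: {samples_seen:.0f} (expected {data_size * epochs})")
print(f"optimizer steps: {optimizer_steps:.0f}")
print(f"samples per step: {samples_per_step:.2f}")
```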

## Validation Set Performance
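Dataset used: test split of [fdyrd/MATH](https://huggingface.co/datasets/fdyrd/MATH).

Metric: accuracy. Each cell reads `accuracy : number of problems`. The Average row and column are consistent with unweighted means of the cell accuracies, and the bottom-right value (0.166) is consistent with the overall accuracy across all 5,000 problems.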

<table>
        <tr>
                <th> Level </th>
                <th> Algebra </th>
                <th> Intermediate Algebra </th>
                <th> Prealgebra </th>
                <th> Precalculus </th>
                <th> Number Theory </th>
                <th> Geometry </th>
                <th> Counting & Probability </th>
                <th> Average </th>
        </tr>
        <tr>
                <td> Level 1 </td>
                <td> 0.541 : 135 </td>
                <td> 0.192 : 52 </td>
                <td> 0.477 : 86 </td>
                <td> 0.228 : 57 </td>
                <td> 0.467 : 30 </td>
                <td> 0.263 : 38 </td>
                <td> 0.359 : 39 </td>
                <td> 0.361 </td>
        </tr>
        <tr>
                <td> Level 2 </td>
                <td> 0.323 : 201 </td>
                <td> 0.109 : 128 </td>
                <td> 0.367 : 177 </td>
                <td> 0.044 : 113 </td>
                <td> 0.38 : 92 </td>
                <td> 0.134 : 82 </td>
                <td> 0.248 : 101 </td>
                <td> 0.229 </td>
        </tr>
        <tr>
                <td> Level 3 </td>
                <td> 0.291 : 261 </td>
                <td> 0.046 : 195 </td>
                <td> 0.308 : 224 </td>
                <td> 0.0 : 127 </td>
                <td> 0.262 : 122 </td>
                <td> 0.088 : 102 </td>
                <td> 0.16 : 100 </td>
                <td> 0.165 </td>
        </tr>
        <tr>
                <td> Level 4 </td>
                <td> 0.18 : 283 </td>
                <td> 0.024 : 248 </td>
                <td> 0.22 : 191 </td>
                <td> 0.009 : 114 </td>
                <td> 0.169 : 142 </td>
                <td> 0.064 : 125 </td>
                <td> 0.09 : 111 </td>
                <td> 0.108 </td>
        </tr>
        <tr>
                <td> Level 5 </td>
                <td> 0.088 : 307 </td>
                <td> 0.004 : 280 </td>
                <td> 0.104 : 193 </td>
                <td> 0.0 : 135 </td>
                <td> 0.136 : 154 </td>
                <td> 0.023 : 132 </td>
                <td> 0.065 : 123 </td>
                <td> 0.06 </td>
        </tr>
        <tr>
                <td> Average </td>
                <td> 0.285 </td>
                <td> 0.075 </td>
                <td> 0.295 </td>
                <td> 0.056 </td>
                <td> 0.283 </td>
                <td> 0.114 </td>
                <td> 0.184 </td>
                <td> 0.166 </td>
        </tr>
</table>
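
As a rough check on how the averages in the table relate to the cells, the sketch below recomputes them from the `accuracy : number of problems` pairs. It is not the author's evaluation script; the helper names `macro_average` and `micro_average` are illustrative. The per-subject averages are reproduced by an unweighted mean over levels, while the overall 0.166 is reproduced by weighting each cell by its problem count.

```python
# (accuracy, problem count) per cell, copied from the table; order is Level 1..5.
cells = {
    "Algebra": [(0.541, 135), (0.323, 201), (0.291, 261), (0.18, 283), (0.088, 307)],
    "Intermediate Algebra": [(0.192, 52), (0.109, 128), (0.046, 195), (0.024, 248), (0.004, 280)],
    "Prealgebra": [(0.477, 86), (0.367, 177), (0.308, 224), (0.22, 191), (0.104, 193)],
    "Precalculus": [(0.228, 57), (0.044, 113), (0.0, 127), (0.009, 114), (0.0, 135)],
    "Number Theory": [(0.467, 30), (0.38, 92), (0.262, 122), (0.169, 142), (0.136, 154)],
    "Geometry": [(0.263, 38), (0.134, 82), (0.088, 102), (0.064, 125), (0.023, 132)],
    "Counting & Probability": [(0.359, 39), (0.248, 101), (0.16, 100), (0.09, 111), (0.065, 123)],
}


def macro_average(pairs):
    """Unweighted mean of the accuracies, ignoring problem counts."""
    return sum(acc for acc, _ in pairs) / len(pairs)


def micro_average(pairs):
    """Accuracy weighted by the number of problems in each cell."""
    total = sum(count for _, count in pairs)
    return sum(acc * count for acc, count in pairs) / total


subject_averages = {name: round(macro_average(p), 3) for name, p in cells.items()}
all_cells = [pair for pairs in cells.values() for pair in pairs]
total_problems = sum(count for _, count in all_cells)
overall = round(micro_average(all_cells), 3)

for name, value in subject_averages.items():
    print(f"{name}: {value}")
print(f"overall ({total_problems} problems): {overall}")
```

Because the cell accuracies are already rounded to three decimals, the recomputed values agree with the table only to that precision.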

## Test Set Performance

```json
[
  {
    "dataset": "MATH500",
    "url": "https://huggingface.co/datasets/qq8933/MATH500",
    "accuracy": 0.286
  },
  {
    "dataset": "GSM8K",
    "url": "https://huggingface.co/datasets/openai/gsm8k",
    "accuracy": 0.382
  }
]
```