A collection of multiple benchmarks for large reasoning model evaluation
datasets-and-models
non-profit
models
39
guanning-ai/SmolLM-360M-RLOO-Math-Step1100
guanning-ai/SmolLM-360M-GRPO-Math-Step1100
guanning-ai/20260102-p_normalization_step4000 • 0.4B params • 122 downloads
guanning-ai/20260102-grpo_step4000 • 0.4B params • 90 downloads
guanning-ai/smollm-gsm8k-pnorm-ckpt4900 • 0.4B params • 11 downloads
guanning-ai/smollm-gsm8k-grpo-ckpt3900 • 0.4B params • 7 downloads
guanning-ai/smollm-gsm8k-grpo-ckpt1000 • 0.4B params • 202 downloads
guanning-ai/maze_sft_weights_1207
guanning-ai/Gai
guanning-ai/1027-math4b-bz1024-pposz128-rollout4-seed20
datasets
138
guanning-ai/gsm8k-platinum • 1.21k rows • 9 downloads
guanning-ai/math500_level5 • 134 rows • 22 downloads
guanning-ai/math500_level4 • 128 rows • 19 downloads
guanning-ai/math500_level3 • 105 rows • 19 downloads
guanning-ai/math500_level2 • 90 rows • 23 downloads
guanning-ai/math500_level1 • 43 rows • 20 downloads
guanning-ai/minervamath • 272 rows • 10 downloads
guanning-ai/smollm-gsm8k-data-1024 • 7.65M rows • 71 downloads
guanning-ai/gsm8k-metamath • 160k rows • 28 downloads
guanning-ai/gsm8k-mumath • 92k rows • 21 downloads