Gen-Verse/ReasonFlux-F1


Lingaaaaaaa committed on Mar 21, 2025

Commit 0cb33c5 · verified · 1 Parent(s): 8f51f2f

Update README.md

Files changed (1)

README.md +4 -5

README.md CHANGED

@@ -35,13 +35,12 @@ Revolutionary template-augmented reasoning paradigm enpowers a 32B model to outp
 We present the evaluation results of our ReasonFlux-F1-32B on challenging reasoning tasks including AIME2024,AIM2025,MATH500 and GPQA-Diamond. To make a fair comparison, we report the results of the LLMs on our evaluation scripts in [ReasonFlux-F1]().
 | Model                                   | AIME2024@pass1 | AIME2025@pass1 | MATH500@pass1 | GPQA@pass1 |
-| --------------------------------------- | -------------- | -------------- | ------------- | ---------- |
 | QwQ-32B-Preview                         | 46.7           | 37.2           | 90.6          | 65.2       |
-| LIMO-32B                                | 56.3           | 44.5           | 94.80         | 58.08      |
 | s1-32B                                  | 56.7           | 49.3           | 93.0          | 59.6       |
-| OpenThinker-32B                         | 66.0           | 53.3           | 94.8          | 60.10      |
-| FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview | 76.67          | 40.0           | 93.4          | 59.09      |
-| R1-Distill-32B                          | 70             | 46.67          | 92            | 59.59      |
 | ReasonFlux-Zero-32B                     | 56.7           | 37.2           | 91.2          | 61.2       |
 | **ReasonFlux-F1-32B**                   | **76.7**      | **53.3**      | **96.0**      | **67.2**  |

 We present the evaluation results of our ReasonFlux-F1-32B on challenging reasoning tasks including AIME2024,AIM2025,MATH500 and GPQA-Diamond. To make a fair comparison, we report the results of the LLMs on our evaluation scripts in [ReasonFlux-F1]().
 | Model                                   | AIME2024@pass1 | AIME2025@pass1 | MATH500@pass1 | GPQA@pass1 |
+| --------------------------------------- | :--------------: | :--------------: | :-------------: | :----------: |
 | QwQ-32B-Preview                         | 46.7           | 37.2           | 90.6          | 65.2       |
+| LIMO-32B                                | 56.3           | 44.5           | 94.8         | 58.1      |
 | s1-32B                                  | 56.7           | 49.3           | 93.0          | 59.6       |
+| OpenThinker-32B                         | 66.0           | 53.3           | 94.8          | 60.1      |
+| R1-Distill-32B                          | 70.0             | 46.7          | 92.0            | 59.6      |
 | ReasonFlux-Zero-32B                     | 56.7           | 37.2           | 91.2          | 61.2       |
 | **ReasonFlux-F1-32B**                   | **76.7**      | **53.3**      | **96.0**      | **67.2**  |