Intelligent-Internet/II-Thought-1.5B-Preview

Phu Nguyen committed on Mar 25, 2025

Commit 53d23f9 (verified) · 1 Parent(s): 54cc0b9

Update README.md

Files changed (1)

README.md +13 -13

@@ -36,19 +36,19 @@ Sampling Configs:
 Additionally, for Live-Code-Bench, we leverage [QWQ-Evaluation](https://github.com/QwenLM/QwQ/tree/main/eval) to reproduce results using a max context length of 32768, averaging over 8 runs.
-| Benchmark | DeepSeek-R1-Distill-Qwen-1.5B | II-Thought-1.5B-Preview |
-|-----------|-------------------------------|--------------------------|
-| AMC23 | 68.48 | **79.41** |
-| AIME24 | 28.07 | **33.39** |
-| AIME25 | 22.6 | **25.68** |
-| Olympiad Bench | 42.04 | **51.63** |
-| Math500 | 82.3 | **86.8** |
-| Math Gakao 2023 English | 72.18 | **76.85** |
-| Minerva Math | 27.62 | **31.89** |
-| Vietnamese Entrance Math Exam | 39.85 | **45.12** |
-| LiveCodeBench | 16.66 | **19.84** |
-| IFEval | 41.95 | **45.56** |
-| **Average** | 44.175 | **49.61** |
+| Benchmark                               | DeepSeek-R1-Distill-Qwen-1.5B | Qwen2.5-Math-1.5B-Instruct | II-Thought-1.5B-Preview |
+|-----------------------------------------|------------------------------|---------------------------|-------------------------|
+| **AMC23**                               | 69.69                        | 54.26                     | **79.77**                   |
+| **AIME24**                              | 29.43                        | 10.73                     | **34.17**                   |
+| **AIME25**                              | 23.39                        | 8.8                       | **26.09**                   |
+| **Olympiad Bench**                      | 43.15                        | 36.07                     | **52.78**                   |
+| **Math500**                             | 83.6                         | 73.15                     | **87.2**                    |
+| **Math Gaokao 2023 English**            | 72.99                        | 62.47                     | **77.21**                   |
+| **Minerva Math**                        | 27.57                        | 24.45                     | **30.79**                   |
+| **Vietnamese Entrance Math Exam**       | 40.32                        | 26.69                     | **46.24**                   |
+| **LiveCodeBench**                       | 16.66                        | 2.6                       | **19.84**                  |
+| **IFEval**                              | 44.24                        | 27.22                     | **44.84**                  |
+| **Average**                             | 45.10                        | 32.64                     | **49.90**                   |
 ## How To Use
 Our model can be utilized in the same manner as Qwen or Deepseek-R1-Distill models.
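
As a sanity check on the updated table in this commit, the **Average** row can be recomputed from the ten benchmark rows. The sketch below assumes the average is a plain unweighted mean over those rows; the README does not state how it was derived, so treat that as an assumption. The names `SCORES`, `REPORTED_AVERAGE`, and `recompute_averages` are illustrative and not part of the repository.

```python
from statistics import fmean

# Scores copied from the added (+) rows of the table, in row order:
# AMC23, AIME24, AIME25, Olympiad Bench, Math500, Math Gaokao 2023 English,
# Minerva Math, Vietnamese Entrance Math Exam, LiveCodeBench, IFEval.
SCORES = {
    "DeepSeek-R1-Distill-Qwen-1.5B": [
        69.69, 29.43, 23.39, 43.15, 83.6, 72.99, 27.57, 40.32, 16.66, 44.24,
    ],
    "Qwen2.5-Math-1.5B-Instruct": [
        54.26, 10.73, 8.8, 36.07, 73.15, 62.47, 24.45, 26.69, 2.6, 27.22,
    ],
    "II-Thought-1.5B-Preview": [
        79.77, 34.17, 26.09, 52.78, 87.2, 77.21, 30.79, 46.24, 19.84, 44.84,
    ],
}

# Values copied from the table's Average row.
REPORTED_AVERAGE = {
    "DeepSeek-R1-Distill-Qwen-1.5B": 45.10,
    "Qwen2.5-Math-1.5B-Instruct": 32.64,
    "II-Thought-1.5B-Preview": 49.90,
}


def recompute_averages(scores):
    """Return the unweighted mean of each model's benchmark scores."""
    return {model: fmean(values) for model, values in scores.items()}


if __name__ == "__main__":
    for model, mean in recompute_averages(SCORES).items():
        gap = abs(mean - REPORTED_AVERAGE[model])
        print(f"{model}: recomputed {mean:.3f}, "
              f"reported {REPORTED_AVERAGE[model]:.2f}, gap {gap:.3f}")
```

Under that assumption, the recomputed means agree with the reported averages to within 0.01 for all three models. The small gap for II-Thought-1.5B-Preview may come from the per-benchmark scores being rounded before they were published, though the source does not confirm this.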