Update README.md

README.md
| Category                  | Benchmark                    |       |       |       |       |       |       |       |
| ------------------------- | ---------------------------- | ----- | ----- | ----- | ----- | ----- | ----- | ----- |
|                           | # Total Params               | 46B   | 8B    | 8B    | 8B    | 12B   | 14B   | 30B   |
|                           | # Activated Params           | 2.5B  | 8B    | 8B    | 8B    | 12B   | 14B   | 3B    |
| **English Understanding** | MMLU-Redux                   | 81.61 | 74.65 | 77.63 | 79.32 | 78.39 | 83.09 | 88.11 |
|                           | MMLU-Pro                     | 63.47 | 50.87 | 54.69 | 63.8  | 60.69 | 67.25 | 78.22 |
|                           | GPQA-Diamond                 | 47.85 | 38.76 | 38.51 | 51.77 | 39.02 | 59.47 | 71.21 |
|                           | SimpleQA                     | 6.52  | 4.44  | 3.51  | 5.5   | 6.22  | 3.28  | 23.39 |
| **Chinese Understanding** | CLUEWSC                      | 88.16 | 77.63 | 81.91 | 82.89 | 91.12 | 88.16 | 92.11 |
|                           | CEval                        | 83.99 | 84.26 | 81.78 | 81.66 | 60.81 | 64.79 | 88.57 |
|                           | C-SimpleQA                   | 42.3  | 25.87 | 23.13 | 37.07 | 28.97 | 24.77 | 75.37 |
| **Math & Reasoning**      | MATH500                      | 82.8  | 68.4  | 79.8  | 85    | 86.8  | 80.6  | 97.2  |
|                           | AIME24                       | 25.62 | 11.25 | 22.92 | 28.33 | 23.96 | 15.83 | 75    |
|                           | AIME25                       | 18.12 | 8.12  | 15.21 | 20.62 | 18.33 | 18.75 | 61.88 |
| **Code**                  | HumanEval                    | 87.8  | 82.3* | 74.39 | 83.54 | 82.32 | 85.37 | 81.71 |
|                           | HumanEval+                   | 81.1  | -     | 70.12 | 76.83 | 75.61 | 83.54 | 76.83 |
|                           | MBPP (EvalPlus)              | 83.1  | 62.4  | 82    | 76.2  | 85.7  | 77.5  | 89.4  |
|                           | MBPP+ (EvalPlus)             | 70.4  | 50.4  | 69.3  | 66.1  | 74.1  | 66.7  | 75.1  |
|                           | LiveCodeBench v5 (2408-2501) | 28.67 | 14.7  | 12.19 | 27.24 | 24.73 | 23.66 | 41.22 |
| **Instruction Following** | IF-Eval                      | 80.04 | 79.3  | 73.01 | 84.47 | 81.52 | 59.33 | 83.92 |
|                           | Multi-IF (en+zh)             | 78.73 | 62.53 | 61.79 | 78.95 | 76.56 | 62.7  | 77.75 |
| **Comprehensive Ability** | MT-Bench                     | 8.23  | 7.86  | 6.875 | 8.21  | 8.675 | 8.625 | 9.33  |
|                           | MT-Eval                      | 8.11  | 7.36  | 6.7   | 8.18  | 8.45  | 8.12  | -     |
|                           | AlignBench v1.1              | 6.85  | 6.13  | 5.99  | 6.95  | 6.3   | 6.33  | 7.06  |
|                           | LiveBench 1125               | 50.1  | 26.3  | 25.5  | 52.1  | 43.1  | 40    | 68.4  |
|                           | Average                      | 53.50 | -     | 46.05 | 52.61 | 50.54 | 48.95 | -     |
Note:

1. For InternLM3-8B-Instruct, the results marked with `*` are sourced from its public report; all other evaluations were conducted with internal evaluation frameworks.
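The `Average` row is consistent with a plain unweighted mean over all 21 benchmark scores, with the 0-10 scale scores (MT-Bench, MT-Eval, AlignBench v1.1) included at face value; columns containing a `-` report no average. A minimal sketch of that computation, using the first model column's scores from the table above:

```python
# Sanity-check the "Average" row for the 46B-total / 2.5B-activated model,
# assuming it is the unweighted mean of all 21 benchmark scores, with the
# 0-10 scale scores (MT-Bench, MT-Eval, AlignBench v1.1) taken at face value.
scores = [
    81.61, 63.47, 47.85, 6.52,      # English understanding
    88.16, 83.99, 42.3,             # Chinese understanding
    82.8, 25.62, 18.12,             # Math & reasoning
    87.8, 81.1, 83.1, 70.4, 28.67,  # Code
    80.04, 78.73,                   # Instruction following
    8.23, 8.11, 6.85, 50.1,         # Comprehensive ability
]

print(f"{sum(scores) / len(scores):.2f}")  # -> 53.50, matching the table
```

The same mean reproduces the other reported averages (46.05, 52.61, 50.54, and 48.95) from their respective columns.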