Update README.md
### Powerful Complex Reasoning Abilities
We conducted a comprehensive evaluation of Ling-flash-2.0’s reasoning capabilities; it achieves strong results on representative benchmarks:
● __Multi-disciplinary knowledge reasoning__: GPQA-Diamond, MMLU-Pro
● __Advanced mathematical reasoning__: AIME 2025, Omni-MATH, OptMATH (advanced mathematical optimization tasks)
● __Challenging code generation__: LiveCodeBench v6, CodeForces-Elo
● __Logical reasoning__: KOR-Bench, ARC-Prize
● __Key regulated industries (Finance, Healthcare)__: FinanceReasoning, HealthBench
Compared with __dense models under 40B__ (e.g., Qwen3-32B-Non-Thinking, Seed-OSS-36B-Instruct (think budget=0)) and __MoE models with larger activated/total parameter counts__ (e.g., Hunyuan-A13B-Instruct, GPT-OSS-120B/low), __Ling-flash-2.0__ demonstrates stronger complex reasoning ability. It is also highly competitive on __creative tasks__ (Creative Writing v3).
<p align="center">
<img src="https://mdn.alipayobjects.com/huamei_fi95qp/afts/img/zxAvQ7QtrAwAAAAAQqAAAAgADkZ7AQFr/fmt.webp"/>