LiangJiang commited on
Commit
0a5c26e
·
verified ·
1 Parent(s): a13ffaa

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -22,7 +22,7 @@ Ring-1T adopts the Ling 2.0 architecture and is trained on the Ling-1T-base foun
22
  To evaluate the deep reasoning capabilities of Ring-1T, we selected representative open-source reasoning models (Ring-1T-preview, Deepseek-V3.1-Terminus-Thinking, Qwen-235B-A22B-Thinking-2507) and closed-source APIs (Gemini-2.5-pro and GPT-5-Thinking(High)) as benchmarks. First, compared to the previously open-sourced preview version, Ring-1T demonstrates more balanced performance across various tasks. Furthermore, Ring-1T achieves open-source leading performance on challenging reasoning benchmarks such as math competitions (AIME 25, HMMT 25), code generation (LiveCodeBench, CodeForce), and logical reasoning (ARC-AGI-1). It also exhibits strong competitiveness in comprehensive tasks (Arena-Hard-v2.0), healthcare (HealthBench), and creative writing (Creative Writing v3).
23
 
24
  <p align="center">
25
- <img src="https://mdn.alipayobjects.com/huamei_d2byvp/afts/img/5TBESJNjsbAAAAAAYYAAAAgADod9AQFr/original" width="100"/>
26
  </p>
27
 
28
  Although we have implemented string-level and semantic-level contamination filtering for benchmark tasks across all training stages—including pre-training, fine-tuning instructions, and reinforcement learning prompts—rigorous decontamination for earlier published benchmarks remains a significant challenge in the industry. To more objectively analyze Ring-1T's deep reasoning capabilities, we conducted tests using the IMO 2025 (International Mathematical Olympiad) held in July this year and the recently concluded ICPC World Finals 2025 (International Collegiate Programming Contest World Finals).
 
22
  To evaluate the deep reasoning capabilities of Ring-1T, we selected representative open-source reasoning models (Ring-1T-preview, Deepseek-V3.1-Terminus-Thinking, Qwen-235B-A22B-Thinking-2507) and closed-source APIs (Gemini-2.5-pro and GPT-5-Thinking(High)) as benchmarks. First, compared to the previously open-sourced preview version, Ring-1T demonstrates more balanced performance across various tasks. Furthermore, Ring-1T achieves open-source leading performance on challenging reasoning benchmarks such as math competitions (AIME 25, HMMT 25), code generation (LiveCodeBench, CodeForce), and logical reasoning (ARC-AGI-1). It also exhibits strong competitiveness in comprehensive tasks (Arena-Hard-v2.0), healthcare (HealthBench), and creative writing (Creative Writing v3).
23
 
24
  <p align="center">
25
+ <img src="https://mdn.alipayobjects.com/huamei_d2byvp/afts/img/5TBESJNjsbAAAAAAYYAAAAgADod9AQFr/original" />
26
  </p>
27
 
28
  Although we have implemented string-level and semantic-level contamination filtering for benchmark tasks across all training stages—including pre-training, fine-tuning instructions, and reinforcement learning prompts—rigorous decontamination for earlier published benchmarks remains a significant challenge in the industry. To more objectively analyze Ring-1T's deep reasoning capabilities, we conducted tests using the IMO 2025 (International Mathematical Olympiad) held in July this year and the recently concluded ICPC World Finals 2025 (International Collegiate Programming Contest World Finals).