LiangJiang committed · Commit b2c63d0 · verified · 1 Parent(s): 283a808

Update README.md

Files changed (1):
  1. README.md +2 -3

README.md CHANGED
@@ -34,7 +34,7 @@ Note: If you are interested in previous version, please visit the past model col
 
 ## Continuously Evolving Deep Reasoning Capabilities
 
-To evaluate the deep reasoning capabilities of Ring-1T, we selected representative open-source reasoning models (Ring-1T-preview, Deepseek-V3.1-Terminus-Thinking, Qwen-235B-A22B-Thinking-2507) and closed-source APIs (Gemini-2.5-pro and GPT-5-Thinking(High)) as benchmarks. First, compared to the previously open-sourced preview version, Ring-1T demonstrates more balanced performance across various tasks. Furthermore, Ring-1T achieves open-source leading performance on challenging reasoning benchmarks such as math competitions (AIME 25, HMMT 25), code generation (LiveCodeBench, CodeForce), and logical reasoning (ARC-AGI-1). It also exhibits strong competitiveness in comprehensive tasks (Arena-Hard-v2.0), healthcare (HealthBench), and creative writing (Creative Writing v3).
+To evaluate the deep reasoning capabilities of Ring-1T, we selected representative open-source reasoning models (Ring-1T-preview, Deepseek-V3.1-Terminus-Thinking, Qwen-235B-A22B-Thinking-2507) and closed-source APIs (Gemini-2.5-pro and GPT-5-Thinking(High)) as benchmarks. First, compared to the previously open-sourced preview version, Ring-1T demonstrates more balanced performance across various tasks. Furthermore, Ring-1T achieves open-source leading performance on challenging reasoning benchmarks such as **math competitions** (AIME 25, HMMT 25), **code generation** (LiveCodeBench, CodeForce), and **logical reasoning** (ARC-AGI-1). It also exhibits strong competitiveness in **comprehensive tasks** (Arena-Hard-v2.0), **healthcare** (HealthBench), and **creative writing** (Creative Writing v3).
 
 <p align="center">
 <img src="https://mdn.alipayobjects.com/huamei_d2byvp/afts/img/5TBESJNjsbAAAAAAYYAAAAgADod9AQFr/original" />
@@ -42,7 +42,7 @@ To evaluate the deep reasoning capabilities of Ring-1T, we selected representati
 
 Although we have implemented string-level and semantic-level contamination filtering for benchmark tasks across all training stages—including pre-training, fine-tuning instructions, and reinforcement learning prompts—rigorous decontamination for earlier published benchmarks remains a significant challenge in the industry. To more objectively analyze Ring-1T's deep reasoning capabilities, we conducted tests using the IMO 2025 (International Mathematical Olympiad) held in July this year and the recently concluded ICPC World Finals 2025 (International Collegiate Programming Contest World Finals).
 
-For the IMO 2025 test, similar to the previous preview version, we integrated Ring-1T into the multi-agent framework AWorld (https://github.com/inclusionAI/AWorld) and used pure natural language reasoning to solve the problems. The results show that Ring-1T solved Problems 1, 3, 4, and 5 in a single attempt (silver medal level at IMO). On the third attempt, it also produced a nearly perfect proof for Problem 2, a geometry proof. For the most challenging Problem 6 (which no AI contestant in IMO 2025 solved correctly), Ring-1T converged to the same answer as Gemini 2.5 Pro—"4048" (the correct answer is 2112). We believe that with ongoing optimizations, Ring-1T has the potential to reach gold medal level at IMO in a single attempt in the future.
+For the **IMO 2025** test, similar to the previous preview version, we integrated Ring-1T into the multi-agent framework AWorld (https://github.com/inclusionAI/AWorld) and used pure natural language reasoning to solve the problems. The results show that Ring-1T solved Problems 1, 3, 4, and 5 in a single attempt (silver medal level at IMO). On the third attempt, it also produced a nearly perfect proof for Problem 2, a geometry proof. For the most challenging Problem 6 (which no AI contestant in IMO 2025 solved correctly), Ring-1T converged to the same answer as Gemini 2.5 Pro—"4048" (the correct answer is 2112). We believe that with ongoing optimizations, Ring-1T has the potential to reach gold medal level at IMO in a single attempt in the future.
 
 <p align="center">
 <img src="https://mdn.alipayobjects.com/huamei_d2byvp/afts/img/mnRJTa5a00gAAAAAQ2AAAAgADod9AQFr/original" width="500"/>
@@ -110,7 +110,6 @@ Ring-1T@Aworld IMO test trajectory: [https://github.com/inclusionAI/AWorld/tree/
 
 ### 🚀 Try Online
 
-**TODO**
 You can experience Ring-1T online at: [ZenMux](https://zenmux.ai/inclusionai/ring-1t?utm_source=hf_inclusionAI)
 
 ### 🔌 API Usage
 
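The API Usage section itself is not included in this diff, so as a purely hypothetical sketch: hosted gateways like ZenMux typically expose an OpenAI-compatible chat-completions API. The endpoint path, model slug (`inclusionai/ring-1t`, guessed from the ZenMux URL above), and sampling parameters below are assumptions, not documented values — check the provider's API reference before use.

```python
import json

# Hypothetical sketch only: the endpoint URL, model slug, and default
# temperature below are illustrative assumptions, not documented values.
API_URL = "https://zenmux.ai/api/v1/chat/completions"  # assumed endpoint

def build_request(prompt: str, model: str = "inclusionai/ring-1t") -> str:
    """Serialize an OpenAI-style chat-completions request body as JSON."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.6,  # illustrative default, tune per task
    }
    return json.dumps(payload)

body = build_request("Prove that the square root of 2 is irrational.")
print(body)
```

Actually sending the request would additionally require an `Authorization: Bearer <API key>` header, e.g. via `urllib.request` or any OpenAI-compatible client pointed at the gateway's base URL.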