inclusionAI/Ring-2.5-1T

Text Generation

compressed-tensors

LiangJiang committed about 18 hours ago

Commit b3087bf (verified) · 1 parent: 7a791ea

Update README.md

Files changed (1): README.md (+1 -1)

@@ -40,7 +40,7 @@ Note: If you are interested in the previous version, please visit the past model
 ## Deep Thinking & Long-horizon task Execution
 <p align="center">
-    <img src="https://mdn.alipayobjects.com/huamei_d2byvp/afts/img/HAkCTKAY7akAAAAAVdAAAAgADod9AQFr/original />
+    <img src="https://mdn.alipayobjects.com/huamei_d2byvp/afts/img/HAkCTKAY7akAAAAAVdAAAAgADod9AQFr/original" />
 </p>
 For evaluating the Deep Thinking and Long-term Execution capabilities of Ring-2.5-1T, we selected representative open-source thinking models (DeepSeek-v3.2-Thinking, Kimi-K2.5-Thinking) and closed-source APIs (GPT-5.2-thinking-high, Gemini-3.0-Pro-preview-thinking-high, Claude-Opus-4.5-Extended-Thinking) as references. Ring-2.5-1T achieves state-of-the-art open-source performance across both high-difficulty reasoning tasks—including mathematics, coding, and logical reasoning (IMOAnswerBench, AIME 26, HMMT 25, LiveCodeBench, ARC-AGI-V2)—and long-horizon task execution such as agent search, tool calling, and software engineering  (Gaia2-search, Tau2-bench, and SWE-Bench Verified).