yuhuixu committed · Commit abc5345 · verified · 1 Parent(s): 36fde32

Create README.md

Files changed (1)
  1. README.md +64 -0
README.md ADDED
@@ -0,0 +1,64 @@
+ <div align="center">
+
+ # Elastic Reasoning
+ <div>
+ <div>
+ <h3>🚀 Scalable Chain of Thoughts via Elastic Reasoning 🌟</h3>
+ </div>
+ </div>
+ <br>
+ <div align="center">
+
+ [![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white)](https://arxiv.org/pdf/2505.05315)
+ [![Hugging Face Collection](https://img.shields.io/badge/E1-fcd022?style=for-the-badge&logo=huggingface&logoColor=000&labelColor)](https://huggingface.co/collections/Salesforce/elastic-reasoning-682b4bba108d6ea0a8bab275)
+ [![Github](https://img.shields.io/badge/Elastic_Reasoning-000000?style=for-the-badge&logo=github&logoColor=000&logoColor=white)](https://github.com/SalesforceAIResearch/Elastic-Reasoning)
+
+ </div>
+ </div>
+
+
+ ## Introduction
+ We propose **Elastic Reasoning**, a novel framework for scalable chain of thoughts
+ that explicitly separates reasoning into two phases, `thinking` and `solution`, with
+ independently allocated budgets. At test time, Elastic Reasoning prioritizes the
+ completeness of solution segments, significantly improving reliability under tight
+ resource constraints. To train models that are robust to truncated thinking, we
+ introduce a lightweight `budget-constrained rollout` strategy, integrated into GRPO,
+ which teaches the model to reason adaptively when the thinking process is cut
+ short and generalizes effectively to unseen budget constraints without additional
+ training.
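
The two-phase budgeting described in the Introduction can be sketched roughly as follows. This is a minimal illustration and not the repository's implementation: the `next_token` callable, the `END_THINK` and `EOS` markers, and the `toy_model` stand-in are all hypothetical names introduced here. The sketch also assumes that the forced end-of-thinking marker does not count against either budget.

```python
from typing import Callable, List

END_THINK = "</think>"  # assumed end-of-thinking marker (hypothetical)
EOS = "<eos>"           # assumed end-of-sequence marker (hypothetical)


def elastic_generate(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    think_budget: int,
    solution_budget: int,
) -> List[str]:
    """Generate with separate token budgets for thinking and solution."""
    tokens = list(prompt)

    # Phase 1: thinking, capped at think_budget tokens.
    for _ in range(think_budget):
        tok = next_token(tokens)
        tokens.append(tok)
        if tok == END_THINK:
            break
    else:
        # Budget exhausted before the model closed its thinking:
        # force the marker so the solution phase can still start.
        tokens.append(END_THINK)

    # Phase 2: solution, capped at solution_budget tokens.
    for _ in range(solution_budget):
        tok = next_token(tokens)
        if tok == EOS:
            break
        tokens.append(tok)

    return tokens[len(prompt):]


def toy_model(tokens: List[str]) -> str:
    """Hypothetical stand-in for a model: thinks forever, then answers briefly."""
    if END_THINK not in tokens:
        return "step"  # keeps thinking until it is cut off
    after = len(tokens) - 1 - tokens.index(END_THINK)
    return "answer" if after < 2 else EOS


out = elastic_generate(toy_model, ["Q"], think_budget=3, solution_budget=4)
print(out)  # ['step', 'step', 'step', '</think>', 'answer', 'answer']
```

Because the solution budget is reserved separately, the answer is still produced in full even though the thinking phase was truncated after three tokens.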
+ <p align="center">
+ <img src="figs/framework.png" width="80%" />
+ </p>
+
+
+ **Main Takeaways**
+ 1. ✂️ Separate budgets: Thinking and solution are explicitly separated with independent budgets, which boosts reliability under tight compute constraints.
+ 2. 🧠 Budget-constrained rollout: We train models to handle truncated reasoning using GRPO.
+ 3. 📈 Flexible scalability: Robust performance across diverse inference budgets on reasoning benchmarks such as AIME and LiveCodeBench.
+ 4. ⚙️ Better performance with fewer tokens: Our trained model generates outputs that are 30% shorter while maintaining (or even improving) accuracy.
+
+ <p align="center">
+ <img src="figs/aime.png" width="46%" />
+ <img src="figs/livecode.png" width="48%" />
+ </p>
+
+ <p align="center">
+ <img src="figs/codetable.png" width="90%" />
+ </p>
+
+
+ ## Citation
+
+
+ ```bibtex
+ @article{xu2025scalable,
+   title={Scalable Chain of Thoughts via Elastic Reasoning},
+   author={Xu, Yuhui and Dong, Hanze and Wang, Lei and Sahoo, Doyen and Li, Junnan and Xiong, Caiming},
+   journal={arXiv preprint arXiv:2505.05315},
+   year={2025}
+ }
+ ```
+
+ ## Ethical Considerations
+ This release is for research purposes only in support of an academic paper. Our models, datasets, and code are not specifically designed or evaluated for all downstream purposes. We strongly recommend users evaluate and address potential concerns related to accuracy, safety, and fairness before deploying this model. We encourage users to consider the common limitations of AI, comply with applicable laws, and leverage best practices when selecting use cases, particularly for high-risk scenarios where errors or misuse could significantly impact people’s lives, rights, or safety. For further guidance on use cases, refer to our AUP and AI AUP.