ARES-RL-7B / README.md

🌟 ARES — Adaptive Multimodal Reasoning Framework
Two-stage adaptive reasoning: cold-start + entropy-shaped RL.

🔑 Highlights
Balanced reasoning across easy & hard tasks via token-level entropy shaping.
SOTA efficiency–accuracy tradeoffs on diverse multimodal and textual benchmarks.

📚 Training Pipeline

1. Adaptive Cold-Start: curate difficulty-aware reasoning traces, so that trace length tracks problem difficulty.
2. Entropy-Shaped RL (AEPO): use high window-entropy tokens as exploration triggers, combined with hierarchical rewards.
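
To make the idea of a window-entropy trigger concrete, here is a minimal toy sketch. It is not the ARES/AEPO implementation: the function names, the trailing-mean window, the window size, and the threshold are all assumptions chosen for illustration. It computes per-token Shannon entropy from next-token distributions, averages it over a short trailing window, and flags positions where the windowed value exceeds a threshold.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of a single next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def window_entropy(entropies, window=4):
    """Mean entropy over a trailing window ending at each position.

    The trailing-mean form and the window size are illustrative assumptions.
    """
    out = []
    for i in range(len(entropies)):
        start = max(0, i - window + 1)
        chunk = entropies[start:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def exploration_triggers(entropies, window=4, threshold=1.0):
    """Indices whose windowed entropy exceeds a hypothetical threshold."""
    smoothed = window_entropy(entropies, window)
    return [i for i, h in enumerate(smoothed) if h > threshold]

# Toy sequence over a 4-token vocabulary: confident, then uncertain, then confident.
confident = [0.97, 0.01, 0.01, 0.01]
uncertain = [0.25, 0.25, 0.25, 0.25]
dists = [confident, confident, uncertain, uncertain, uncertain, confident]

entropies = [token_entropy(d) for d in dists]
print(exploration_triggers(entropies, window=2, threshold=1.0))  # [3, 4]
```

Averaging over a window rather than thresholding single tokens means one isolated high-entropy token (position 2 here) does not fire a trigger on its own; only sustained uncertainty does.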

📂 Resources

Paper: ARES: Multimodal Adaptive Reasoning via Difficulty-Aware Token-Level Entropy Shaping (https://arxiv.org/abs/2510.08457)
Code: GitHub, shawn0728/ARES

📌 Citation

@misc{chen2025aresmultimodaladaptivereasoning,
      title={ARES: Multimodal Adaptive Reasoning via Difficulty-Aware Token-Level Entropy Shaping}, 
      author={Shuang Chen and Yue Guo and Yimeng Ye and Shijue Huang and Wenbo Hu and Haoxi Li and Manyuan Zhang and Jiayu Chen and Song Guo and Nanyun Peng},
      year={2025},
      eprint={2510.08457},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.08457}, 
}

Give ARES a shot and tell us what reasoning challenges it helps you solve! 🚀