R-Align: Enhancing Generative Reward Models through Rationale-Centric Meta-Judging Paper • 2602.06763 • Published Feb 6
Intention Chain-of-Thought Prompting with Dynamic Routing for Code Generation Paper • 2512.14048 • Published Dec 16, 2025
3ViewSense: Spatial and Mental Perspective Reasoning from Orthographic Views in Vision-Language Models Paper • 2603.07751 • Published 17 days ago • 12
CMR Scaling Law: Predicting Critical Mixture Ratios for Continual Pre-training of Language Models Paper • 2407.17467 • Published Jul 24, 2024
SynthDoc: Bilingual Documents Synthesis for Visual Document Understanding Paper • 2408.14764 • Published Aug 27, 2024
Balancing Speciality and Versatility: a Coarse to Fine Framework for Supervised Fine-tuning Large Language Model Paper • 2404.10306 • Published Apr 16, 2024 • 1
Cultivating Helpful, Personalized, and Creative AI Tutors: A Framework for Pedagogical Alignment using Reinforcement Learning Paper • 2507.20335 • Published Jul 27, 2025
MathSmith: Towards Extremely Hard Mathematical Reasoning by Forging Synthetic Problems with a Reinforced Policy Paper • 2508.05592 • Published Aug 7, 2025 • 6
ReSURE: Regularizing Supervision Unreliability for Multi-turn Dialogue Fine-tuning Paper • 2508.19996 • Published Aug 27, 2025
RelayFormer: A Unified Local-Global Attention Framework for Scalable Image and Video Manipulation Localization Paper • 2508.09459 • Published Aug 13, 2025 • 2
LexSemBridge: Fine-Grained Dense Representation Enhancement through Token-Aware Embedding Augmentation Paper • 2508.17858 • Published Aug 25, 2025 • 10