TIP: Token Importance in On-Policy Distillation Paper • 2604.14084 • Published 24 days ago • 14
Self-Distillation Zero: Self-Revision Turns Binary Rewards into Dense Supervision Paper • 2604.12002 • Published 25 days ago • 11
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows Paper • 2510.24411 • Published Oct 28, 2025 • 73
Paused Agents 1 TESSY Reasoning Demo - Sudanese Arabic 🧠1 Analyze Sudanese Arabic samples with standard vs TESSY reasoning
How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data Paper • 2604.14164 • Published Mar 23 • 35
How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data Paper • 2604.14164 • Published Mar 23 • 35
How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data Paper • 2604.14164 • Published Mar 23 • 35
Kernel-Smith: A Unified Recipe for Evolutionary Kernel Optimization Paper • 2603.28342 • Published Mar 30 • 26
Kernel-Smith: A Unified Recipe for Evolutionary Kernel Optimization Paper • 2603.28342 • Published Mar 30 • 26
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published Mar 26 • 132