ComBench: A Benchmark for Rigorous Proof Reasoning and Constructive Realization in Olympiad-Level Combinatorics Paper • 2606.10479 • Published 17 days ago • 19
COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation Paper • 2605.31264 • Published 28 days ago • 118
AgentDoG1.5 Collection A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security • 11 items • Updated 5 days ago • 13
AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security Paper • 2605.29801 • Published 29 days ago • 144
AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security Paper • 2605.29801 • Published 29 days ago • 144
AgentDoG1.5 Collection A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security • 11 items • Updated 5 days ago • 13
A Survey of Large Audio Language Models: Generalization, Trustworthiness, and Outlook Paper • 2605.20266 • Published May 18 • 56
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published Apr 8 • 328
ATBench: A Diverse and Realistic Trajectory Benchmark for Long-Horizon Agent Safety Paper • 2604.02022 • Published Apr 2 • 15
OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data Paper • 2603.15594 • Published Mar 16 • 150
A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment Paper • 2504.15585 • Published Apr 22, 2025 • 14
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers Paper • 2506.23918 • Published Jun 30, 2025 • 90
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond Paper • 2503.21614 • Published Mar 27, 2025 • 43
Derail Yourself: Multi-turn LLM Jailbreak Attack through Self-discovered Clues Paper • 2410.10700 • Published Oct 14, 2024 • 3
Refining Alignment Framework for Diffusion Models with Intermediate-Step Preference Ranking Paper • 2502.01667 • Published Feb 1, 2025