Guided Self-Evolving LLMs with Minimal Human Supervision Paper • 2512.02472 • Published Dec 2, 2025 • 53
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search Paper • 2509.25454 • Published Sep 29, 2025 • 143
Large Reasoning Models Learn Better Alignment from Flawed Thinking Paper • 2510.00938 • Published Oct 1, 2025 • 59
LiveTradeBench: Seeking Real-World Alpha with Large Language Models Paper • 2511.03628 • Published Nov 5, 2025 • 13
PromptBridge: Cross-Model Prompt Transfer for Large Language Models Paper • 2512.01420 • Published Dec 1, 2025 • 10
Dyna-Mind: Learning to Simulate from Experience for Better AI Agents Paper • 2510.09577 • Published Oct 10, 2025 • 8
Diversity Has Always Been There in Your Visual Autoregressive Models Paper • 2511.17074 • Published Nov 21, 2025 • 8
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance Paper • 2511.13254 • Published Nov 17, 2025 • 136
Search Self-play: Pushing the Frontier of Agent Capability without Supervision Paper • 2510.18821 • Published Oct 21, 2025 • 18
Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning Paper • 2510.03259 • Published Sep 26, 2025 • 57
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning Paper • 2510.19338 • Published Oct 22, 2025 • 115
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning Paper • 2511.16043 • Published Nov 20, 2025 • 109
Reactive Transformer (RxT) -- Stateful Real-Time Processing for Event-Driven Reactive Language Models Paper • 2510.03561 • Published Oct 3, 2025 • 25
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning Paper • 2509.08755 • Published Sep 10, 2025 • 57
MM-CRITIC: A Holistic Evaluation of Large Multimodal Models as Multimodal Critique Paper • 2511.09067 • Published Nov 12, 2025 • 2
Incentivizing Agentic Reasoning in LLM Judges via Tool-Integrated Reinforcement Learning Paper • 2510.23038 • Published Oct 27, 2025 • 1
MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning Paper • 2511.06805 • Published Nov 10, 2025 • 13
JudgeBoard: Benchmarking and Enhancing Small Language Models for Reasoning Evaluation Paper • 2511.15958 • Published Nov 20, 2025 • 1
VeriSciQA: An Auto-Verified Dataset for Scientific Visual Question Answering Paper • 2511.19899 • Published Nov 25, 2025
TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows Paper • 2512.05150 • Published Dec 3, 2025 • 75
DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling Paper • 2512.03000 • Published Dec 2, 2025 • 37
Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion Paper • 2512.04926 • Published Dec 4, 2025 • 42
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning Paper • 2512.07461 • Published Dec 8, 2025 • 77
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding Paper • 2512.13586 • Published Dec 15, 2025 • 92
Next-Embedding Prediction Makes Strong Vision Learners Paper • 2512.16922 • Published Dec 18, 2025 • 84
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers Paper • 2512.17351 • Published Dec 19, 2025 • 27
HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices Paper • 2512.14052 • Published Dec 16, 2025 • 42
CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion Paper • 2512.19535 • Published Dec 22, 2025 • 12
LongVideoAgent: Multi-Agent Reasoning with Long Videos Paper • 2512.20618 • Published Dec 23, 2025 • 54
Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations Paper • 2512.21004 • Published Dec 24, 2025 • 13
TimeBill: Time-Budgeted Inference for Large Language Models Paper • 2512.21859 • Published Dec 26, 2025 • 25
SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents Paper • 2512.22322 • Published Dec 26, 2025 • 39
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models Paper • 2512.24618 • Published 29 days ago • 143
UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision Paper • 2601.03193 • Published 23 days ago • 46
Digital Twin AI: Opportunities and Challenges from Large Language Models to World Models Paper • 2601.01321 • Published 25 days ago • 18
Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs Paper • 2601.17058 • Published 7 days ago • 142