Code-Switching Information Retrieval: Benchmarks, Analysis, and the Limits of Current Retrievers Paper • 2604.17632 • Published 17 days ago • 11
Dual-View Training for Instruction-Following Information Retrieval Paper • 2604.18845 • Published 16 days ago • 10
NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems Paper • 2601.11004 • Published Jan 16 • 30
The Confidence Dichotomy: Analyzing and Mitigating Miscalibration in Tool-Use Agents Paper • 2601.07264 • Published Jan 12 • 24
Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction Paper • 2512.18880 • Published Dec 21, 2025 • 25
Global PIQA: Evaluating Physical Commonsense Reasoning Across 100+ Languages and Cultures Paper • 2510.24081 • Published Oct 28, 2025 • 22
Good Intentions Beyond ACL: Who Does NLP for Social Good, and Where? Paper • 2510.04434 • Published Oct 6, 2025 • 6
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search Paper • 2509.25454 • Published Sep 29, 2025 • 148
EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning Paper • 2509.22576 • Published Sep 26, 2025 • 137
The Majority is not always right: RL training for solution aggregation Paper • 2509.06870 • Published Sep 8, 2025 • 15
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published Aug 7, 2025 • 189
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training Paper • 2508.00414 • Published Aug 1, 2025 • 96
Phi-Ground Tech Report: Advancing Perception in GUI Grounding Paper • 2507.23779 • Published Jul 31, 2025 • 46
Diversity-Enhanced Reasoning for Subjective Questions Paper • 2507.20187 • Published Jul 27, 2025 • 26
Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face Article • Jul 29, 2025 • 222
The Invisible Leash: Why RLVR May Not Escape Its Origin Paper • 2507.14843 • Published Jul 20, 2025 • 85