ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks Paper • 2603.27862 • Published Mar 29 • 31
Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction Paper • 2605.05242 • Published 9 days ago • 91
Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction Paper • 2605.05242 • Published 9 days ago • 91
Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction Paper • 2605.05242 • Published 9 days ago • 91
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis Paper • 2603.20278 • Published Mar 17 • 96
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis Paper • 2603.20278 • Published Mar 17 • 96
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis Paper • 2603.20278 • Published Mar 17 • 96
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use Paper • 2510.05592 • Published Oct 7, 2025 • 111
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use Paper • 2510.05592 • Published Oct 7, 2025 • 111
VideoScore2: Think before You Score in Generative Video Evaluation Paper • 2509.22799 • Published Sep 26, 2025 • 26
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use Paper • 2509.01055 • Published Sep 1, 2025 • 81
Internal and External Impacts of Natural Language Processing Papers Paper • 2505.16061 • Published May 21, 2025
A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery Paper • 2406.10833 • Published Jun 16, 2024
StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs Paper • 2505.20139 • Published May 26, 2025 • 19
TEG-DB: A Comprehensive Dataset and Benchmark of Textual-Edge Graphs Paper • 2406.10310 • Published Jun 14, 2024 • 2
VideoEval-Pro: Robust and Realistic Long Video Understanding Evaluation Paper • 2505.14640 • Published May 20, 2025 • 16
Synthetic Data Generation with Large Language Models for Text Classification: Potential and Limitations Paper • 2310.07849 • Published Oct 11, 2023 • 2
Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding Paper • 2305.14232 • Published May 23, 2023