PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution Paper • 2601.10657 • Published 15 days ago • 20
One Sample to Rule Them All: Extreme Data Efficiency in RL Scaling Paper • 2601.03111 • Published 24 days ago • 9
FRoG: Evaluating Fuzzy Reasoning of Generalized Quantifiers in Large Language Models Paper • 2407.01046 • Published Jul 1, 2024
WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning Paper • 2505.16421 • Published May 22, 2025 • 19
Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models Paper • 2505.16265 • Published May 22, 2025 • 8
Squeeze3D: Your 3D Generation Model is Secretly an Extreme Neural Compressor Paper • 2506.07932 • Published Jun 9, 2025 • 12
Agents of Change: Self-Evolving LLM Agents for Strategic Planning Paper • 2506.04651 • Published Jun 5, 2025 • 8
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text Paper • 2506.05209 • Published Jun 5, 2025 • 60
Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation Paper • 2504.07072 • Published Apr 9, 2025 • 9
Can Vision-Language Models Answer Face to Face Questions in the Real-World? Paper • 2503.19356 • Published Mar 25, 2025 • 2
Knowledge of Knowledge: Exploring Known-Unknowns Uncertainty with Large Language Models Paper • 2305.13712 • Published May 23, 2023 • 2
Game-theoretic LLM: Agent Workflow for Negotiation Games Paper • 2411.05990 • Published Nov 8, 2024 • 8
INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge Paper • 2411.19799 • Published Nov 29, 2024 • 16
DebUnc: Improving Large Language Model Agent Communication With Uncertainty Metrics Paper • 2407.06426 • Published Jul 8, 2024 • 1
MultiAgent Collaboration Attack: Investigating Adversarial Attacks in Large Language Model Collaborations via Debate Paper • 2406.14711 • Published Jun 20, 2024 • 1