SciEvalKit: An Open-source Evaluation Toolkit for Scientific General Intelligence Paper • 2512.22334 • Published Dec 26, 2025 • 35
COMPASS: A Framework for Evaluating Organization-Specific Policy Alignment in LLMs Paper • 2601.01836 • Published 23 days ago • 10
AI Meets Brain: Memory Systems from Cognitive Neuroscience to Autonomous Agents Paper • 2512.23343 • Published about 1 month ago • 28
Diversity or Precision? A Deep Dive into Next Token Prediction Paper • 2512.22955 • Published Dec 28, 2025 • 8
Confidence Estimation for LLMs in Multi-turn Interactions Paper • 2601.02179 • Published 23 days ago • 16
Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process Paper • 2512.23988 • Published 30 days ago • 16
Nested Learning: The Illusion of Deep Learning Architectures Paper • 2512.24695 • Published 28 days ago • 41
Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling Paper • 2512.23959 • Published 30 days ago • 109
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem Paper • 2512.24873 • Published 28 days ago • 103
Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space Paper • 2512.24617 • Published 29 days ago • 61
Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits Paper • 2512.20578 • Published Dec 23, 2025 • 83
Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models Paper • 2506.01413 • Published Jun 2, 2025 • 16
MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs Paper • 2505.24858 • Published May 30, 2025 • 17