OceanPile: A Large-Scale Multimodal Ocean Corpus for Foundation Models Paper • 2605.00877 • Published 13 days ago • 13
Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis Paper • 2604.24198 • Published 11 days ago • 21
Chat2Workflow: A Benchmark for Generating Executable Visual Workflows with Natural Language Paper • 2604.19667 • Published 17 days ago • 22
LightThinker++: From Reasoning Compression to Memory Management Paper • 2604.03679 • Published Apr 4 • 38
SkillX: Automatically Constructing Skill Knowledge Bases for Agents Paper • 2604.04804 • Published Apr 6 • 33
How Controllable Are Large Language Models? A Unified Evaluation across Behavioral Granularities Paper • 2603.02578 • Published Mar 3 • 25
Making LLMs Truly Remember You | LightMem: Lightweight and Efficient Memory-Augmented Generation Article • Feb 28 • 4
Create, Evaluate, and Connect AI Skills | SkillNet: A Large-Scale Agentic "Skill Graph" Knowledge Base Article • Feb 28 • 13
InnoEval: On Research Idea Evaluation as a Knowledge-Grounded, Multi-Perspective Reasoning Problem Paper • 2602.14367 • Published Feb 16 • 17
From Data to Behavior: Predicting Unintended Model Behaviors Before Training Paper • 2602.04735 • Published Feb 4 • 15
Why Steering Works: Toward a Unified View of Language Model Parameter Dynamics Paper • 2602.02343 • Published Feb 2 • 13
Aligning Agentic World Models via Knowledgeable Experience Learning Paper • 2601.13247 • Published Jan 19 • 15
Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency Paper • 2601.05905 • Published Jan 9 • 20