How Controllable Are Large Language Models? A Unified Evaluation across Behavioral Granularities Paper • 2603.02578 • Published 10 days ago • 23
Article Making LLMs Truly Remember You | LightMem: Lightweight and Efficient Memory-Augmented Generation 13 days ago • 3
Article Create, Evaluate, and Connect AI Skills | SkillNet: A Large-Scale Agentic "Skill Graph" Knowledge Base 13 days ago • 12
InnoEval: On Research Idea Evaluation as a Knowledge-Grounded, Multi-Perspective Reasoning Problem Paper • 2602.14367 • Published 25 days ago • 17
From Data to Behavior: Predicting Unintended Model Behaviors Before Training Paper • 2602.04735 • Published Feb 4 • 15
Why Steering Works: Toward a Unified View of Language Model Parameter Dynamics Paper • 2602.02343 • Published Feb 2 • 13
Aligning Agentic World Models via Knowledgeable Experience Learning Paper • 2601.13247 • Published Jan 19 • 15
Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency Paper • 2601.05905 • Published Jan 9 • 20
InnoGym: Benchmarking the Innovation Potential of AI Agents Paper • 2512.01822 • Published Dec 1, 2025 • 36
Memory Collection A prompt is text-based memory. System II prompting updates memory. Parametric memory is long-term, while prompt-based memory is short-term. • 23 items • Updated Oct 22, 2025 • 2
LightMem: Lightweight and Efficient Memory-Augmented Generation Paper • 2510.18866 • Published Oct 21, 2025 • 114
Executable Knowledge Graphs for Replicating AI Research Paper • 2510.17795 • Published Oct 20, 2025 • 15