Towards Automated Kernel Generation in the Era of LLMs Paper • 2601.15727 • Published 5 days ago • 16
Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces Paper • 2601.11868 • Published 10 days ago • 29
NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems Paper • 2601.11004 • Published 11 days ago • 29
Article LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family 8 days ago • 64
Article How We Built a Semantic Highlight Model To Save Token Cost for RAG 12 days ago • 59
When Personalization Misleads: Understanding and Mitigating Hallucinations in Personalized LLMs Paper • 2601.11000 • Published 11 days ago • 26
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs Paper • 2601.08763 • Published 14 days ago • 141
sui-1: Grounded and Verifiable Long-Form Summarization Paper • 2601.08472 • Published 14 days ago • 2
Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency Paper • 2601.05905 • Published 18 days ago • 18
💧 LFM2.5 Collection Collection of Instruct, Base, and Japanese LFM2.5-1.2B models. • 22 items • Updated about 11 hours ago • 80