TSRBench: A Comprehensive Multi-task Multi-modal Time Series Reasoning Benchmark for Generalist Models Paper • 2601.18744 • Published 6 days ago • 10
LTX-2: Efficient Joint Audio-Visual Foundation Model Paper • 2601.03233 • Published 26 days ago • 142
Pretraining Frame Preservation in Autoregressive Video Memory Compression Paper • 2512.23851 • Published Dec 29, 2025 • 24
Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction Paper • 2512.18880 • Published Dec 21, 2025 • 25
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers Paper • 2512.17351 • Published Dec 19, 2025 • 28
Are We on the Right Way to Assessing LLM-as-a-Judge? Paper • 2512.16041 • Published Dec 17, 2025 • 34
Are We on the Right Way to Assessing LLM-as-a-Judge? Paper • 2512.16041 • Published Dec 17, 2025 • 34
FrameDiffuser: G-Buffer-Conditioned Diffusion for Neural Forward Frame Rendering Paper • 2512.16670 • Published Dec 18, 2025 • 4
FrontierCS: Evolving Challenges for Evolving Intelligence Paper • 2512.15699 • Published Dec 17, 2025 • 5