Measuring the Depth of LLM Unlearning via Activation Patching Paper • 2605.24614 • Published 12 days ago • 7
NSF-SciFy: Mining the NSF Awards Database for Scientific Claims Paper • 2503.08600 • Published 10 days ago • 4
Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players Paper • 2605.28816 • Published 8 days ago • 419
EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation Paper • 2605.23271 • Published 13 days ago • 79
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 22 days ago • 270
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling Paper • 2603.25746 • Published Mar 26 • 155
SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models Paper • 2603.16859 • Published Mar 17 • 248