Composing Concepts from Images and Videos via Concept-prompt Binding Paper • 2512.09824 • Published Dec 10, 2025 • 28
MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment Paper • 2512.06628 • Published Dec 7, 2025 • 13
AnyTalker: Scaling Multi-Person Talking Video Generation with Interactivity Refinement Paper • 2511.23475 • Published Nov 28, 2025 • 43
Hyper-Bagel: A Unified Acceleration Framework for Multimodal Understanding and Generation Paper • 2509.18824 • Published Sep 23, 2025 • 23
pyannote/speaker-diarization-3.1 Automatic Speech Recognition • Updated May 10, 2024 • 13.8M • 1.47k
deepseek-ai/DeepSeek-Prover-V2-671B Text Generation • 685B • Updated Apr 30, 2025 • 322 • 817