MOVA: Towards Scalable and Synchronized Video-Audio Generation • Paper • 2602.08794 • Published Feb 9 • 156
ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development • Paper • 2601.11077 • Published Jan 16 • 65
MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization • Paper • 2601.01554 • Published Jan 4 • 57
ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL • Paper • 2505.24875 • Published May 30, 2025 • 10
World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning • Paper • 2503.10480 • Published Mar 13, 2025 • 56
Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models • Paper • 2410.02416 • Published Oct 3, 2024 • 34
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features • Paper • 2502.14786 • Published Feb 20, 2025 • 160