Motion 3-to-4: 3D Motion Reconstruction for 4D Synthesis Paper • 2601.14253 • Published 13 days ago • 10
HeartMuLa: A Family of Open Sourced Music Foundation Models Paper • 2601.10547 • Published 18 days ago • 41
V-DPM: 4D Video Reconstruction with Dynamic Point Maps Paper • 2601.09499 • Published 19 days ago • 9
UM-Text: A Unified Multimodal Model for Image Understanding Paper • 2601.08321 • Published 20 days ago • 9
ResTok: Learning Hierarchical Residuals in 1D Visual Tokenizers for Autoregressive Image Generation Paper • 2601.03955 • Published 26 days ago • 3
FlowBlending: Stage-Aware Multi-Model Sampling for Fast and High-Fidelity Video Generation Paper • 2512.24724 • Published Dec 31, 2025 • 7
Dream2Flow: Bridging Video Generation and Open-World Manipulation with 3D Object Flow Paper • 2512.24766 • Published Dec 31, 2025 • 9
What matters for Representation Alignment: Global Information or Spatial Structure? Paper • 2512.10794 • Published Dec 11, 2025 • 9
ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models Paper • 2512.07843 • Published Nov 24, 2025 • 22
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published Oct 9, 2025 • 38
ISDrama: Immersive Spatial Drama Generation through Multimodal Prompting Paper • 2504.20630 • Published Apr 29, 2025 • 9
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head Paper • 2304.12995 • Published Apr 25, 2023
Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and KiSing-v2 Paper • 2401.17619 • Published Jan 31, 2024 • 1
SingMOS: An extensive Open-Source Singing Voice Dataset for MOS Prediction Paper • 2406.10911 • Published Jun 16, 2024
Muskits-ESPnet: A Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm Paper • 2409.07226 • Published Sep 11, 2024 • 1