Have Seen Me Before? Automating Dataset Updates Towards Reliable and Timely Evaluation Paper • 2402.11894 • Published Feb 19, 2024
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Paper • 2502.10248 • Published Feb 14, 2025 • 57
Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model Paper • 2503.11251 • Published Mar 14, 2025 • 1
STORYANCHORS: Generating Consistent Multi-Scene Story Frames for Long-Form Narratives Paper • 2505.08350 • Published May 13, 2025
AdaSwitch: Adaptive Switching between Small and Large Agents for Effective Cloud-Local Collaborative Learning Paper • 2410.13181 • Published Oct 17, 2024 • 1
SetPO: Set-Level Policy Optimization for Diversity-Preserving LLM Reasoning Paper • 2602.01062 • Published Feb 1, 2026 • 2
EXCEEDS: Extracting Complex Events via Nugget-based Grid Modeling in Scientific Domain Paper • 2406.14075 • Published 20 days ago
Awaking Spatial Intelligence in Unified Multimodal Understanding and Generation Paper • 2605.04128 • Published 9 days ago • 17
TextLDM: Language Modeling with Continuous Latent Diffusion Paper • 2605.07748 • Published 6 days ago • 22
StyleMe3D: Stylization with Disentangled Priors by Multiple Encoders on 3D Gaussians Paper • 2504.15281 • Published Apr 21, 2025 • 23
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published Apr 8, 2025 • 186
FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding Paper • 2503.14935 • Published Mar 19, 2025
MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent Paper • 2502.03207 • Published Feb 5, 2025 • 1
MikuDance: Animating Character Art with Mixed Motion Dynamics Paper • 2411.08656 • Published Nov 13, 2024
MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D Paper • 2411.02336 • Published Nov 4, 2024 • 24
MeshXL: Neural Coordinate Field for Generative 3D Foundation Models Paper • 2405.20853 • Published May 31, 2024 • 1
Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation Paper • 2306.17115 • Published Jun 29, 2023 • 12
Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models Paper • 2312.13913 • Published Dec 21, 2023 • 24