Have Seen Me Before? Automating Dataset Updates Towards Reliable and Timely Evaluation Paper • 2402.11894 • Published Feb 19, 2024
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Paper • 2502.10248 • Published Feb 14, 2025 • 57
Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model Paper • 2503.11251 • Published Mar 14, 2025 • 1
STORYANCHORS: Generating Consistent Multi-Scene Story Frames for Long-Form Narratives Paper • 2505.08350 • Published May 13, 2025
AdaSwitch: Adaptive Switching between Small and Large Agents for Effective Cloud-Local Collaborative Learning Paper • 2410.13181 • Published Oct 17, 2024 • 1
SetPO: Set-Level Policy Optimization for Diversity-Preserving LLM Reasoning Paper • 2602.01062 • Published Feb 1 • 1
EXCEEDS: Extracting Complex Events via Nugget-based Grid Modeling in Scientific Domain Paper • 2406.14075 • Published 18 days ago
Awaking Spatial Intelligence in Unified Multimodal Understanding and Generation Paper • 2605.04128 • Published 7 days ago • 15
TextLDM: Language Modeling with Continuous Latent Diffusion Paper • 2605.07748 • Published 4 days ago • 20
Beyond Retrieval: A Multitask Benchmark and Model for Code Search Paper • 2605.04615 • Published 6 days ago • 22
SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing Paper • 2604.04911 • Published Apr 6 • 36