VINS-120K: Ultra High-Resolution Image Editing with A Large-Scale Dataset Paper • 2605.23518 • Published May 22 • 8
PixVerve: Advancing Native UHR Image Generation to 100MP with a Large-Scale High-Quality Dataset Paper • 2605.20147 • Published May 19 • 11
RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details Paper • 2604.06870 • Published Apr 8 • 44
Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning Paper • 2510.23473 • Published Oct 27, 2025 • 87
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes Paper • 2503.23461 • Published Mar 30, 2025 • 94
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution Paper • 2501.02976 • Published Jan 6, 2025 • 56
StrandHead: Text to Strand-Disentangled 3D Head Avatars Using Hair Geometric Priors Paper • 2412.11586 • Published Dec 16, 2024 • 11
InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption Paper • 2412.09283 • Published Dec 12, 2024 • 19
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement Paper • 2411.06558 • Published Nov 10, 2024 • 36