Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking Paper • 2601.04720 • Published 24 days ago • 52
Over++: Generative Video Compositing for Layer Interaction Effects Paper • 2512.19661 • Published Dec 22, 2025 • 12
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025 • 506
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published Oct 13, 2025 • 166
FlashWorld: High-quality 3D Scene Generation within Seconds Paper • 2510.13678 • Published Oct 15, 2025 • 73
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Paper • 2502.01061 • Published Feb 3, 2025 • 223
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published Jan 9, 2025 • 95
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos Paper • 2501.04001 • Published Jan 7, 2025 • 47
Cosmos World Foundation Model Platform for Physical AI Paper • 2501.03575 • Published Jan 7, 2025 • 82
OminiControl: Minimal and Universal Control for Diffusion Transformer Paper • 2411.15098 • Published Nov 22, 2024 • 61
MyTimeMachine: Personalized Facial Age Transformation Paper • 2411.14521 • Published Nov 21, 2024 • 23