DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation Paper • 2602.12160 • Published 25 days ago • 38
deepseek-ai/DeepSeek-R1-0528 Text Generation • 685B • Updated May 29, 2025 • 1.09M • 2.4k
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning? Paper • 2505.23359 • Published May 29, 2025 • 38
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning Paper • 2504.14509 • Published Apr 20, 2025 • 53
DreamO: A Unified Framework for Image Customization Paper • 2504.16915 • Published Apr 23, 2025 • 24
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22, 2025 • 441
Region-Adaptive Sampling for Diffusion Transformers Paper • 2502.10389 • Published Feb 14, 2025 • 53
ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features Paper • 2502.04320 • Published Feb 6, 2025 • 36
Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models Paper • 2501.06751 • Published Jan 12, 2025 • 32
PuLID-FLUX • Running on Zero • Featured • 2.07k • Generate custom images from text and a reference photo