SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published 7 days ago • 43
BitDance: Scaling Autoregressive Generative Models with Binary Tokens Paper • 2602.14041 • Published 5 days ago • 39
Why LLMs Aren't Scientists Yet: Lessons from Four Autonomous Research Attempts Paper • 2601.03315 • Published Jan 6 • 6
OmniPSD: Layered PSD Generation with Diffusion Transformer Paper • 2512.09247 • Published Dec 10, 2025 • 48
RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards Paper • 2512.00473 • Published Nov 29, 2025 • 26
TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows Paper • 2512.05150 • Published Dec 3, 2025 • 76
EmoVid: A Multimodal Emotion Video Dataset for Emotion-Centric Video Understanding and Generation Paper • 2511.11002 • Published Nov 14, 2025 • 4
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention Paper • 2509.24006 • Published Sep 28, 2025 • 118
CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching Paper • 2509.19300 • Published Sep 23, 2025 • 7
Voxlect: A Speech Foundation Model Benchmark for Modeling Dialects and Regional Languages Around the Globe Paper • 2508.01691 • Published Aug 3, 2025 • 10
EmoNet-Voice: A Fine-Grained, Expert-Verified Benchmark for Speech Emotion Detection Paper • 2506.09827 • Published Jun 11, 2025 • 21
Natural Language Supervision for General-Purpose Audio Representations Paper • 2309.05767 • Published Sep 11, 2023 • 9
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining Paper • 2308.05734 • Published Aug 10, 2023 • 38
Llama 2: Open Foundation and Fine-Tuned Chat Models Paper • 2307.09288 • Published Jul 18, 2023 • 250