Mindstorms in Natural Language-Based Societies of Mind Paper • 2305.17066 • Published May 26, 2023 • 3
Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models Paper • 2404.02747 • Published Apr 3, 2024 • 13
MarDini: Masked Autoregressive Diffusion for Video Generation at Scale Paper • 2410.20280 • Published Oct 26, 2024 • 23
SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer Paper • 2509.24695 • Published Sep 29 • 45
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention Paper • 2312.07987 • Published Dec 13, 2023 • 41
Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing Paper • 2505.00315 • Published May 1 • 1
Huxley-Gödel Machine: Human-Level Coding Agent Development by an Approximation of the Optimal Self-Improving Machine Paper • 2510.21614 • Published Oct 24 • 22
Infinite Sampling: Efficient and Stable Grouped RL Training for Large Language Models Paper • 2506.22950 • Published Jun 28
FlashDP: Private Training Large Language Models with Efficient DP-SGD Paper • 2507.01154 • Published Jul 1
DistZO2: High-Throughput and Memory-Efficient Zeroth-Order Fine-tuning LLMs with Distributed Parallel Computing Paper • 2507.03211 • Published Jul 3
When Thinking Backfires: Mechanistic Insights Into Reasoning-Induced Misalignment Paper • 2509.00544 • Published Aug 30 • 11
PartEdit: Fine-Grained Image Editing using Pre-Trained Diffusion Models Paper • 2502.04050 • Published Feb 6 • 1
Mind-the-Glitch: Visual Correspondence for Detecting Inconsistencies in Subject-Driven Generation Paper • 2509.21989 • Published Sep 26 • 22