Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published Mar 12, 2025 • 77
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published Jan 9, 2025 • 98
Self-Directed Synthetic Dialogues and Revisions Technical Report Paper • 2407.18421 • Published Jul 25, 2024
Vid3D: Synthesis of Dynamic 3D Scenes using 2D Video Diffusion Paper • 2406.11196 • Published Jun 17, 2024 • 8
Data Governance in the Age of Large-Scale Data-Driven Language Technology Paper • 2206.03216 • Published May 4, 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 37
Habitat 2.0: Training Home Assistants to Rearrange their Habitat Paper • 2106.14405 • Published Jun 28, 2021
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset Paper • 2303.03915 • Published Mar 7, 2023 • 7
InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models Paper • 2306.08757 • Published Jun 14, 2023
Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments for Embodied AI Paper • 2109.08238 • Published Sep 16, 2021
Simple and Effective Masked Diffusion Language Models Paper • 2406.07524 • Published Jun 11, 2024 • 12
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models Paper • 2405.20541 • Published May 30, 2024 • 24
Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling Paper • 2403.03234 • Published Mar 5, 2024 • 14
Towards Characterizing Domain Counterfactuals For Invertible Latent Causal Models Paper • 2306.11281 • Published Jun 20, 2023
StarCraftImage: A Dataset For Prototyping Spatial Reasoning Methods For Multi-Agent Environments Paper • 2401.04290 • Published Jan 9, 2024 • 3