Arabic Speech Datasets Collection Best Datasets for Arabic Speech Tasks • 16 items • Updated 25 days ago • 15
METAGENE-1: Metagenomic Foundation Model for Pandemic Monitoring Paper • 2501.02045 • Published Jan 3, 2025 • 22
Deep Ignorance Collection This collection contains the model and data artifacts from O'Brien et al. (2025). https://deepignorance.ai • 44 items • Updated Dec 17, 2025 • 10
Dayhoff Atlas Collection The models and datasets that comprise the Dayhoff Atlas • 10 items • Updated Jul 28, 2025 • 10
TabArena: A Living Benchmark for Machine Learning on Tabular Data Paper • 2506.16791 • Published Jun 20, 2025 • 3
Article Accelerating AI for Drug Discovery: Ginkgo’s GDPx Functional Genomics and GDPa Antibody Developability Dataset Series Jun 24, 2025 • 18
Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA Paper • 2505.21115 • Published May 27, 2025 • 140
Pretraining Language Models for Diachronic Linguistic Change Discovery Paper • 2504.05523 • Published Apr 7, 2025 • 5
Dolphin: A Large-Scale Automatic Speech Recognition Model for Eastern Languages Paper • 2503.20212 • Published Mar 26, 2025 • 8
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published Mar 12, 2025 • 75
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published Mar 10, 2025 • 101
Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs Paper • 2502.17424 • Published Feb 24, 2025 • 4