MediaTech Collection Collection of public datasets from the French administration, chunked, vectorized and ready to use in AI projects. • 9 items • Updated Feb 4 • 6
Ministral 3 Collection A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated Dec 2, 2025 • 158
Common Models Collection The first generation of models pretrained on Common Corpus. • 5 items • Updated Dec 5, 2024 • 42
Pleias-RAG Collection New generation of small reasoning models for RAG, search, and source summarization. • 4 items • Updated Apr 24, 2025 • 29
view article Article The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare +1 Apr 19, 2024 • 195
olmOCR Collection olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 12 items • Updated Dec 23, 2025 • 150