Even Small Reasoners Should Quote Their Sources: Introducing the Pleias-RAG Model Family • Paper • 2504.18225 • Published Apr 25, 2025 • 15
Article • Releasing the largest multilingual open pretraining dataset • Pclanglais • Nov 13, 2024 • 107
Article • Announcing Finance Commons and the Bad Data Toolbox: Pioneering Open Data and Advanced Document Processing • Pclanglais • Jul 19, 2024 • 20
OpenCulture • Collection • A multilingual dataset of public domain books and newspapers. • 25 items • Updated Mar 2 • 134