Spanish Datasets (Translation)
qanastek/WMT-16-PubMed • Updated Oct 22, 2022 • 974k • 33 • 6
yhavinga/ccmatrix • Updated Mar 14, 2024 • 2.43k • 25
somosnlp-hackathon-2022/spanish-to-quechua • Updated Oct 25, 2022 • 128k • 54 • 12
bible-nlp/biblenlp-corpus • Updated Dec 5, 2024 • 159 • 37

Spanish Datasets
Collection of Spanish-language datasets for the awesome datasets project.
CohereLabs/aya_collection • Updated Apr 15, 2025 • 514M • 2.86k • 230
CohereLabs/aya_collection_language_split • Updated Apr 15, 2025 • 514M • 2.52k • 112
allenai/c4 • Updated Jan 9, 2024 • 10.4B • 494k • 513
togethercomputer/RedPajama-Data-V2 • Updated Nov 21, 2024 • 2.25k • 393

Spanish Datasets (Purpose)
Helsinki-NLP/opus_books • Updated Mar 29, 2024 • 1.25M • 13.1k • 86
stanford-oval/ccnews • Updated Aug 31, 2024 • 893M • 3.65k • 32
openai/MMMLU • Updated Oct 16, 2024 • 393k • 10.7k • 514
qanastek/ELRC-Medical-V2 • Updated Oct 24, 2022 • 278k • 2.79k • 16

Spanish Datasets (Audio)
ylacombe/google-chilean-spanish • Updated Nov 27, 2023 • 4.37k • 183 • 20
FBK-MT/Speech-MASSIVE • Updated Oct 7, 2025 • 97.6k • 1.16k • 47
facebook/voxpopuli • Updated 7 days ago • 1.26M • 7.4k • 142
speechbrain/common_language • Updated Jun 12, 2023 • 663 • 43