whr94621's Collections
multilingual_benchmark
multilingual_domain_datasets
LLM_LongContext
LLM_Eval
LLM_Alignment
LLM_Pretrain
LLM_Multilingual
llm_datasets_japanese
llm_datasets_multi
llm_datasets_arabic
llm_synthesis_data
llm_datasets_id
llm_datasets_translation
llm_models_pretrain
llm_datasets_korean
llm_datasets_vi
llm_datasets_ru
llm_datasets_th
curated_sft_data

multilingual_domain_datasets

updated Feb 17, 2025

Multilingual datasets, excluding those that are just a cleaned version of CC.

  • nyuuzyou/edutexts

    Viewer • Updated 22 days ago • 1.38M • 146 • 5

  • llm-jp/AnswerCarefully

    Preview • Updated Dec 2, 2025 • 153 • 48

  • sander-wood/m4-rag

    Viewer • Updated Oct 12, 2025 • 1.04M • 14 • 13