whr94621's Collections
multilingual_benchmark
multilingual_domain_datasets
LLM_LongContext
LLM_Eval
LLM_Alignment
LLM_Pretrain
LLM_Multilingual
llm_datasets_japanese
llm_datasets_multi
llm_datasets_arabic
llm_synthesis_data
llm_datasets_id
llm_datasets_translation
llm_models_pretrain
llm_datasets_korean
llm_datasets_vi
llm_datasets_ru
llm_datasets_th
curated_sft_data

multilingual_domain_datasets

updated Feb 17, 2025

Multilingual datasets, excluding those that are just cleaned versions of CC.

  • nyuuzyou/edutexts

    Viewer • Updated Jan 14 • 1.38M • 67 • 5

  • llm-jp/AnswerCarefully

    Viewer • Updated Feb 16 • 4.62k • 412 • 53

  • sander-wood/m4-rag

    Viewer • Updated Oct 12, 2025 • 1.04M • 14 • 13