HenriLD's Collections
Dataset Mix for Pre-Training SLMs

Updated Mar 25, 2025

  • open-thoughts/OpenThoughts-114k

    Viewer • Updated Aug 31, 2025 • 228k • 143k • 833

  • open-r1/OpenThoughts-114k-math

    Viewer • Updated Jan 30, 2025 • 89.1k • 811 • 92

  • HuggingFaceFW/fineweb

    Viewer • Updated Jul 11, 2025 • 52.5B • 650k • 2.77k

  • FreedomIntelligence/medical-o1-reasoning-SFT

    Viewer • Updated Apr 22, 2025 • 90.1k • 7.44k • 1.09k

  • AI-MO/NuminaMath-CoT

    Viewer • Updated Nov 25, 2024 • 860k • 36.7k • 568

  • dmariko/init_data

    Viewer • Updated Jul 10, 2024 • 188k • 20

  • HenriLD/FDA_Docs

    Viewer • Updated Feb 12, 2025 • 30.4k • 5

  • ChayanM/MIMIC-Impression-Dataset

    Viewer • Updated Apr 28, 2024 • 292k • 32 • 2

  • allenai/cord19

    Updated Nov 3, 2022 • 377 • 8

  • MedRAG/pubmed

    Viewer • Updated Feb 27, 2024 • 2.21M • 10.5k • 99

  • EleutherAI/SmolLM2-135M-10B

    Viewer • Updated Apr 15, 2025 • 10.1M • 2.47k • 1
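
The size labels in the list above are abbreviated ("228k", "52.5B"), which makes the sources hard to compare when planning a mix. The following is a minimal sketch, not the collection author's method, that converts those labels to integers and reports each source's share under naive size-proportional sampling. It assumes that the first figure after each update date is the dataset's row count; `allenai/cord19` is left out because its entry shows no viewer row count. The names `SIZE_LABELS`, `parse_count`, and `proportional_shares` are hypothetical helpers introduced for illustration.

```python
from decimal import Decimal

# Size labels copied from the list above.
# Assumption: the first figure after the update date is the row count.
SIZE_LABELS = {
    "open-thoughts/OpenThoughts-114k": "228k",
    "open-r1/OpenThoughts-114k-math": "89.1k",
    "HuggingFaceFW/fineweb": "52.5B",
    "FreedomIntelligence/medical-o1-reasoning-SFT": "90.1k",
    "AI-MO/NuminaMath-CoT": "860k",
    "dmariko/init_data": "188k",
    "HenriLD/FDA_Docs": "30.4k",
    "ChayanM/MIMIC-Impression-Dataset": "292k",
    "MedRAG/pubmed": "2.21M",
    "EleutherAI/SmolLM2-135M-10B": "10.1M",
}

SUFFIXES = {"k": 10**3, "M": 10**6, "B": 10**9}


def parse_count(label: str) -> int:
    """Convert an abbreviated count such as '89.1k' or '52.5B' to an int."""
    label = label.strip()
    if label and label[-1] in SUFFIXES:
        # Decimal avoids binary floating-point error on values like 89.1.
        return int(Decimal(label[:-1]) * SUFFIXES[label[-1]])
    return int(Decimal(label))


def proportional_shares(labels: dict[str, str]) -> dict[str, float]:
    """Return each source's fraction of the total row count."""
    counts = {name: parse_count(value) for name, value in labels.items()}
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


if __name__ == "__main__":
    shares = proportional_shares(SIZE_LABELS)
    # Print sources from largest to smallest share.
    for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{name:48s} {share:.6%}")
```

Under this assumption `HuggingFaceFW/fineweb` accounts for almost all rows, so a size-proportional mix would barely sample the smaller sources; any real mixing weights would have to be chosen separately and are not stated on this page.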