casey-martin's Collections

Subject-Matter-Expertise

Updated Oct 17, 2025

High-quality pretraining and instruction datasets for law, mathematics, and science.

  • pile-of-law/pile-of-law

    Updated Jan 8, 2023 • 3.31k • 273

  • EleutherAI/proof-pile-2

    Updated Oct 25, 2023 • 9.25k • 223

  • gabrielaltay/pubtator-central-bigbio-kb-2022-12-18

    Viewer • Updated Jan 7, 2023 • 35.1M • 586 • 1

  • bigcode/the-stack-v2-train-smol-ids

    Viewer • Updated Apr 23, 2024 • 40.1M • 2.33k • 49

  • allenai/SciRIFF

    Viewer • Updated Jun 13, 2024 • 433k • 184 • 48

  • zjunlp/Mol-Instructions

    Updated Mar 3, 2024 • 1.51k • 66

  • AI-MO/NuminaMath-CoT

    Viewer • Updated Nov 25, 2024 • 860k • 38.8k • 569

  • AI-MO/NuminaMath-TIR

    Viewer • Updated Nov 25, 2024 • 72.5k • 4.5k • 150

  • Team-ACE/ToolACE

    Viewer • Updated Sep 4, 2024 • 11.3k • 2.41k • 174

    Note: Function calling


  • NousResearch/hermes-function-calling-v1

    Viewer • Updated Jan 3 • 11.6k • 8.95k • 403

    Note: Function calling


  • Salesforce/xlam-function-calling-60k

    Viewer • Updated Jan 24, 2025 • 60k • 9.84k • 605

    Note: Function calling


  • trendmicro-ailab/Primus-FineWeb

    Viewer • Updated Aug 9, 2025 • 3.39M • 788 • 20