Coder SFT Data
  ise-uiuc/Magicoder-Evol-Instruct-110K • Updated Dec 28, 2023 • 111k • 3k • 170
  theblackcat102/evol-codealpaca-v1 • Updated Mar 10, 2024 • 111k • 2.49k • 171
  Multilingual-Multimodal-NLP/McEval-Instruct • Updated Jun 12, 2024 • 35.9k • 516 • 36
  KodCode/KodCode-V1-SFT-4o • Updated Mar 16, 2025 • 410k • 77 • 10
Coder DPO
  argilla/ultrafeedback-binarized-preferences-cleaned • Updated Dec 11, 2023 • 60.9k • 2.7k • 157
  argilla/ultrafeedback-multi-binarized-quality-preferences-cleaned • Updated Dec 11, 2023 • 155k • 16 • 5
Funny Questions (Long-COT)
  JackGao/brain-teaser-chinese • Updated Mar 4, 2025 • 1.15k • 29 • 5
  Conard/fortune-telling • Updated Feb 17, 2025 • 207 • 356 • 166
Reasoning Model
  deepcogito/cogito-v1-preview-qwen-32B • Text Generation • 33B • Updated Apr 8, 2025 • 2.4k • 115
Pretrain Data Utils
  mlfoundations/fasttext-oh-eli5 • Updated Aug 1, 2024 • 29
  hkust-nlp/preselect-fasttext-classifier • Text Classification • Updated Mar 6, 2025 • 43 • 8
  HuggingFaceFW/fineweb-edu-classifier • Text Classification • 0.1B • Updated Nov 17, 2024 • 28.1k • 203
Coder SFT Data (Long-COT)
  nvidia/Llama-Nemotron-Post-Training-Dataset • Updated May 8, 2025 • 3.91M • 5.22k • 641
  open-r1/codeforces-cots • Updated Mar 28, 2025 • 254k • 2.19k • 200
  nvidia/OpenCodeReasoning • Updated May 4, 2025 • 753k • 3.54k • 521
  nvidia/OpenCodeReasoning-2 • Updated May 17, 2025 • 2.16M • 1.45k • 49
Math SFT Data
  BytedTsinghua-SIA/DAPO-Math-17k • Updated Apr 18, 2025 • 1.79M • 5.39k • 141
  nvidia/OpenMathInstruct-2 • Updated Nov 25, 2024 • 22M • 14.9k • 224
  nvidia/OpenMathReasoning • Updated May 27, 2025 • 5.68M • 13.8k • 391
  miromind-ai/MiroMind-M1-SFT-719K • Updated Jul 22, 2025 • 719k • 515 • 17
WebPage Related
  HuggingFaceM4/WebSight • Updated Mar 26, 2024 • 2.75M • 6.25k • 380
  bytedance-research/Web-Bench • Updated May 19, 2025 • 1k • 502 • 8
  luzimu/WebGen-Bench • Updated Sep 29, 2025 • 6.77k • 160 • 1
Coder Models
  agentica-org/DeepCoder-14B-Preview • Text Generation • 15B • Updated May 11, 2025 • 609 • 681
  Qwen/Qwen2.5-Coder-32B-Instruct • Text Generation • 33B • Updated Jan 12, 2025 • 373k • 1.97k