rain2sun's Collections
High-Quality-Datasets
Updated Dec 2, 2024
High-quality datasets with a high density of knowledge.
wikimedia/wikipedia • Viewer • Updated Jan 9, 2024 • 61.6M • 67.8k • 996
OpenCoder-LLM/opc-annealing-corpus • Viewer • Updated May 29 • 15.6M • 7.93k • 41
hltcoe/megawika • Updated Jan 31 • 39.3k • 41
allenai/dolmino-mix-1124 • Viewer • Updated Oct 29 • 170M • 46.1k • 88