Hugging Face
BEEspoke Data
community
https://www.bees.org/
Followers: 62
AI & ML interests
'an LLM is only as good as the dataset it was trained on' - Sun Tzu
Recent Activity
pszemraj updated a model 19 days ago: BEE-spoke-data/NVIDIA-Nemotron-Parse-v1.2
pszemraj published a model 22 days ago: BEE-spoke-data/NVIDIA-Nemotron-Parse-v1.2
kenhktsui authored a paper 5 months ago: MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources
Team members (9)
BEE-spoke-data's datasets (82)
Sort: Recently updated
BEE-spoke-data/peS2o-100k_en-xlong • Viewer • Updated Dec 29, 2025 • 100k • 8
BEE-spoke-data/falcon-refinedweb-100k_en-xlong • Viewer • Updated Dec 29, 2025 • 100k • 5
BEE-spoke-data/falcon-refinedweb-100k_en-long • Viewer • Updated Dec 29, 2025 • 100k • 56 • 4
BEE-spoke-data/falcon-refinedweb-1M_en_medium • Viewer • Updated Dec 29, 2025 • 1M • 27 • 1
BEE-spoke-data/falcon-refinedweb-100k_en_med-sample • Viewer • Updated Dec 29, 2025 • 200k • 5
BEE-spoke-data/scientificbeekeeping • Viewer • Updated Dec 29, 2025 • 471 • 6
BEE-spoke-data/govdocs1-image • Viewer • Updated Dec 29, 2025 • 199k • 49
BEE-spoke-data/medium-articles-en • Viewer • Updated Dec 29, 2025 • 180k • 10 • 2
BEE-spoke-data/govdocs1-txt-raw • Viewer • Updated Dec 29, 2025 • 75.5k • 11
BEE-spoke-data/govdocs1-by-extension • Viewer • Updated Dec 29, 2025 • 733k • 277 • 2
BEE-spoke-data/code-tutorials-en • Viewer • Updated Dec 29, 2025 • 620k • 26 • 1
BEE-spoke-data/pile-python-filtered • Viewer • Updated Dec 29, 2025 • 841k • 9 • 1
BEE-spoke-data/code_contests_instruct • Viewer • Updated Dec 29, 2025 • 12.2M • 170 • 6
BEE-spoke-data/the-stack-smol-xl-readable • Viewer • Updated Dec 29, 2025 • 424k • 12 • 1
BEE-spoke-data/rp_books-en • Viewer • Updated Dec 29, 2025 • 120k • 14 • 1
BEE-spoke-data/Long-Data-Col-rp_pile_pretrain • Viewer • Updated Dec 29, 2025 • 10.9M • 32 • 2
BEE-spoke-data/wikipedia-20230901.en-deduped • Viewer • Updated Dec 29, 2025 • 11.9M • 291 • 6
BEE-spoke-data/open-web-math-minhash • Viewer • Updated Dec 29, 2025 • 3.64M • 9
BEE-spoke-data/coedit-reworded-deduped • Updated Dec 29, 2025 • 20 • 5
BEE-spoke-data/SYSK-Transcripts • Viewer • Updated Dec 29, 2025 • 5.79k • 6
BEE-spoke-data/bees-internal • Viewer • Updated Dec 29, 2025 • 4.08k • 18 • 7
BEE-spoke-data/bees-v0 • Viewer • Updated Dec 29, 2025 • 48.6k • 11 • 1