452 GB
85 files
Updated about 6 hours ago
Name            Size
data
.gitattributes  2.31 kB
README.md       493 Bytes
README.md

The Pile is no longer available from its original site. Blue Jerry is a deduplicated copy of EleutherAI's Pile, a high-quality dataset for the pre-training stage of language models.

Last updated
May 25
