Common Pile v0.1 (Collection): All resources related to Common Pile v0.1, an 8 TB dataset of public-domain and openly licensed text • 4 items • Updated Jun 6, 2025 • 40
Tiny Series (Collection): Tiny datasets that strengthen the foundation of Small Language Models • 14 items • Updated 8 days ago • 44